Multimodal Item Classification fully based on Transformers