Book: Multimodal AI Systems, Wei Sun

Multimodal AI Systems

Architectures, Training, and Applications

Author: Wei Sun
Language: English
Binding: Paperback
Publisher: Independently published
Availability: Awaiting stock
Publication date: 30.06.2026
Price: 73.88 EUR / 144.50 BGN

Book information

Author: Wei Sun
Language: English
Binding: Paperback
Published: 2026
Pages: 480
EAN: 9798184326054
Enbook ID: 53025974
Publisher: Independently published
Weight: 1104
Dimensions: 216 x 280 x 25

Full description

The Transformer Principles Series is a three-volume graduate-level treatise that builds a complete mathematical and engineering understanding of modern AI systems, from the foundational attention mechanism to large language models and multimodal architectures.

Volume III - Multimodal AI Systems: Architectures, Training, and Applications extends the Transformer paradigm beyond text into vision, audio, and video. It covers modality-specific encoders and tokenizers, cross-modal fusion and contrastive alignment (CLIP, SigLIP), diffusion and flow-matching generative models, vision-language architectures (ViT, LLaVA, Q-Former), text-to-image and text-to-video generation, speech and audio processing, efficient inference for multimodal models, long-context scaling, and reasoning agents that perceive and act across modalities.