Popular Viewed Papers & Topics
This paper introduces SparQ Attention, a technique that significantly reduces the memory bandwidth requirements of generative large language models during inference, thereby improving LLM inference throughput.
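The bandwidth saving comes from reading only a small, dynamically chosen part of the key-value cache per decoding step. A minimal single-head, single-query NumPy sketch of that idea is below; the parameter names `r` (query components used for score approximation) and `k` (keys fetched in full) follow the paper's description, but the code is an illustrative simplification, not the authors' implementation (it omits, for example, the paper's mean-value reallocation step).

```python
import numpy as np

def sparq_attention(q, K, V, r=16, k=64):
    """Illustrative sketch of the SparQ Attention idea.

    q: (d,) query vector for the current step.
    K, V: (n, d) cached keys and values.
    Only r columns of K and k full rows of K, V are read,
    instead of the entire cache.
    """
    n, d = K.shape
    # Step 1: approximate attention scores using only the r
    # largest-magnitude query components, so just r columns of
    # the key cache need to be fetched.
    top_r = np.argsort(np.abs(q))[-r:]
    approx_scores = q[top_r] @ K[:, top_r].T / np.sqrt(d)
    # Step 2: fetch full keys/values only for the k keys with the
    # highest approximate scores, and attend over that subset.
    top_k = np.argsort(approx_scores)[-k:]
    scores = q @ K[top_k].T / np.sqrt(d)
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ V[top_k]
```

With `r = d` and `k = n` the sketch degenerates to exact softmax attention; smaller `r` and `k` trade a little accuracy for proportionally less cache traffic.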
Scaling up vision models has become a practical route to more powerful visual representations. But is bigger always better? This paper examines the settings in which larger vision models may not be necessary.