Microsoft Research’s Post


279,591 followers

YOCO is a novel decoder-decoder architecture for LLMs that improves memory efficiency by caching key-value pairs only once. It reduces KV cache memory and prefilling time by orders of magnitude, making 1M-length LLMs practical. https://msft.it/6040YnEVM
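
To make the "cache once" idea concrete, here is a back-of-the-envelope sketch that is not taken from the post. It compares a conventional decoder that stores keys and values in every layer with a design that stores a single shared set. All model dimensions are hypothetical, and the sketch ignores any additional per-layer state the actual architecture may keep, so treat the ratio as illustrative rather than as a reported result.

```python
def kv_cache_bytes(seq_len, num_kv_heads, head_dim, cached_layers, bytes_per_value=2):
    """Bytes needed to hold keys and values for `seq_len` tokens."""
    # Factor 2: one tensor for keys, one for values.
    return 2 * cached_layers * seq_len * num_kv_heads * head_dim * bytes_per_value


# Hypothetical model shape, chosen only for illustration (not from the post).
NUM_LAYERS = 32
NUM_KV_HEADS = 8
HEAD_DIM = 128
SEQ_LEN = 1_000_000  # the "1M-length" setting mentioned in the post

# Conventional decoder: every layer keeps its own keys and values.
per_layer = kv_cache_bytes(SEQ_LEN, NUM_KV_HEADS, HEAD_DIM, cached_layers=NUM_LAYERS)
# Simplified "cache once" view: a single shared set of keys and values.
shared_once = kv_cache_bytes(SEQ_LEN, NUM_KV_HEADS, HEAD_DIM, cached_layers=1)

print(f"per-layer cache: {per_layer / 2**30:.1f} GiB")
print(f"cached once:     {shared_once / 2**30:.1f} GiB")
print(f"reduction:       {per_layer // shared_once}x")
```

Under this simplified accounting the saving equals the number of layers that would otherwise each hold a cache, which is why it grows with model depth.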

3 Comments

Berowne Hlavaty

Senior Quant Analyst at J.P. Morgan

Innovative, but I do wonder what happens when a word late in the text changes the context of an earlier word. Example: "I was at the farm then I went to the store and bought an apple but I was disappointed when I found the M4 chip was only available in the iPad, and they still don't offer touch screens on laptops."

Grigory Sapunov

Thanks for the work! I've made an overview of the paper https://gonzoml.substack.com/p/you-only-cache-once-decoder-decoder

Ömer A.

Technical Lead Generative AI / Senior Data Scientist / AI Consultant at Lufthansa Group

Jon-Paul Boyd


More Relevant Posts

Somasundaram A

Assistant Product Manager at Avnet (APAC/EMEA/ANZ - Industrial and automation process control)
11mo
Use it to experiment with and learn the AMD #Zynq UltraScale+ architecture. Write your review and the kit is yours to keep! https://lnkd.in/gPdh8PSV
Sindhu BN

Assistant Product Manager
1y
Sign up to review the #ZUBoard 1CG development kit from Avnet! Use it to experiment with and learn the AMD #Zynq UltraScale+ architecture. Write your review and the kit is yours to keep! https://lnkd.in/gUmYQdK4
Jisu Park

Computer Hardware Engineer
8mo Edited
Oct 10, 2023: SimplePipelineCPU microarchitecture 0.1.1. Decoder in SystemVerilog. It is unfinished, but I created the outline for the RV32I ISA. https://lnkd.in/gK-yv8rc
Pranab Ghosh

AI Consultant || MIT Alumni || Entrepreneur || Open Source Project Owner || Blogger
1mo
Intriguing! This seems to imply that the network architecture in a DL model doesn’t matter. Given the same number of parameters, different NN models converge to more or less the same performance. “this may imply that as long as there are enough parameters, and things are reasonably well-conditioned (i.e. a decent number of nonlinearities and connections between the pieces) then it really doesn't matter how you arrange them, i.e. any sufficiently good architecture works just fine. i feel there's something really deep here, and we may be already very close to the upper bound of how well we can approximate a given function given a certain amount of compute” #deeplearning https://lnkd.in/g55TQ-mv

6 Comments
Dan Sohayda

Solutions Architect at VMware by Broadcom Software
3w
Take a 3-minute break and learn all the benefits of VMware vSAN Express Storage Architecture from this quick video.

Key Benefits of vSAN Express Storage Architecture (ESA)

https://www.youtube.com/
Borin Phy

Project Manager at Farnell Global | PMP, PSM
1mo
This #element14Community webinar introduces the AMD Versal architecture and it will explore the differences between the various families of AMD Versal devices, as well as delving into the different development frameworks and entry points available depending on the target application. You can save your seat now: https://lnkd.in/gF5t6pdD
Preethi H.M

Product Author at element14 Electronics
1mo
This #element14Community webinar introduces the AMD Versal architecture and it will explore the differences between the various families of AMD Versal devices, as well as delving into the different development frameworks and entry points available depending on the target application. You can save your seat now: https://lnkd.in/dKZat3ur
Yvette Jambon

Outside Account Manager at Newark Electronics
1mo
This #element14Community webinar introduces the AMD Versal architecture and it will explore the differences between the various families of AMD Versal devices, as well as delving into the different development frameworks and entry points available depending on the target application. You can save your seat now: https://lnkd.in/gAMzH6fx
Anil Kumar M

BI - Manager at element14
3w
This #element14Community webinar introduces the AMD Versal architecture and it will explore the differences between the various families of AMD Versal devices, as well as delving into the different development frameworks and entry points available depending on the target application. You can save your seat now: https://lnkd.in/gNY8XhxX


