
  <rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
      <title>Apple Machine Learning Research</title>
      <link>https://machinelearning.apple.com</link>
      <description>Apple machine learning teams are engaged in state-of-the-art research in machine learning and artificial intelligence. Learn about the latest advancements.</description>
      <language>en</language>
      <lastBuildDate>Mon, 08 Jun 2026 00:00:00 GMT</lastBuildDate>
      <atom:link href="https://machinelearning.apple.com/rss.xml" rel="self" type="application/rss+xml"/>
      
  <item>
    <guid isPermaLink="false">introducing-third-generation-of-apple-foundation-models</guid>
    <title>Introducing the Third Generation of Apple’s Foundation Models</title>
    <link>https://machinelearning.apple.com/research/introducing-third-generation-of-apple-foundation-models</link>
    <description>Our next generation of Apple Intelligence is centered around our users, integrated deeply into our operating systems, and powered by a bold new architecture with privacy at its core.
At the heart of this architecture is our third generation of Apple Foundation Models (AFM), a family of five foundation models custom-built in collaboration with Google. These span from on-device models to server-based models running on Private Cloud Compute.
Apple Foundation Models are built to unlock a wide range of helpful experiences for our users, like an entirely new Siri and intelligent tools that make…</description>
    <pubDate>Mon, 08 Jun 2026 00:00:00 GMT</pubDate>
  </item>

  <item>
    <guid isPermaLink="false">apple-at-cvpr-2026</guid>
    <title>IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2026</title>
    <link>https://machinelearning.apple.com/updates/apple-at-cvpr-2026</link>
    <description><![CDATA[<p>Apple is presenting new research at the annual <a href="https://cvpr.thecvf.com" target="_blank" aria-label="IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) - Opens in a new window" class="icon icon-after icon-external" rel="noopener nofollow">IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)</a>, which takes place in person in Denver at the Colorado Convention Center from June 3 to June 7.</p>
<p>We are proud to sponsor the conference, which brings together the scientific and industrial research communities in computer vision and pattern recognition. Below is an overview of Apple’s participation at CVPR 2026.</p>]]></description>
    <pubDate>Thu, 28 May 2026 00:00:00 GMT</pubDate>
  </item>

  <item>
    <guid isPermaLink="false">vsas-bench-streaming-assistant</guid>
    <title>VSAS-Bench: Real-Time Evaluation of Visual Streaming Assistant Models</title>
    <link>https://machinelearning.apple.com/research/vsas-bench-streaming-assistant</link>
    <description>Streaming vision-language models (VLMs) continuously generate responses given an instruction prompt and an online stream of input frames. This is a core mechanism for real-time visual assistants. Existing VLM frameworks predominantly assess models in offline settings. In contrast, the performance of a streaming VLM depends on additional metrics beyond pure video understanding, including proactiveness, which reflects the timeliness of the model’s responses, and consistency, which captures the robustness of its responses over time. To address this limitation, we propose VSAS-Bench, a new…</description>
    <pubDate>Fri, 22 May 2026 00:00:00 GMT</pubDate>
  </item>

  <item>
    <guid isPermaLink="false">epicache</guid>
    <title>EpiCache: Episodic KV Cache Management for Long-Term Conversation on Resource-Constrained Environments</title>
    <link>https://machinelearning.apple.com/research/epicache</link>
    <description>Modern large language models (LLMs) extend context lengths to millions of tokens, enabling coherent, personalized responses grounded in long conversational history. However, the Key-Value (KV) cache grows linearly with the extended dialogue history, causing the model’s memory footprint to quickly exceed device limits. While recent KV cache compression methods attempt to reduce memory usage, most apply cache eviction after processing the entire context, incurring unbounded peak memory usage. Additionally, query-dependent eviction narrows the cache semantics to a single query, leading to failure…</description>
    <pubDate>Tue, 19 May 2026 00:00:00 GMT</pubDate>
  </item>

  <item>
    <guid isPermaLink="false">balcaprl-mllm-image-captioning</guid>
    <title>BalCapRL: A Balanced Framework for RL-Based MLLM Image Captioning</title>
    <link>https://machinelearning.apple.com/research/balcaprl-mllm-image-captioning</link>
    <description>Image captioning is one of the most fundamental tasks in computer vision. Owing to its open-ended nature, it has received significant attention in the era of multimodal large language models (MLLMs). In pursuit of ever more detailed and accurate captions, recent work has increasingly turned to reinforcement learning (RL). However, existing captioning-RL methods and evaluation metrics often emphasize a narrow notion of caption quality, inducing trade-offs across core dimensions of captioning. For example, utility-oriented objectives can encourage noisy, hallucinated, or overlong captions that…</description>
    <pubDate>Mon, 11 May 2026 00:00:00 GMT</pubDate>
  </item>

  <item>
    <guid isPermaLink="false">gaussian-head-reconstruction</guid>
    <title>Large-Scale High-Quality 3D Gaussian Head Reconstruction from Multi-View Captures</title>
    <link>https://machinelearning.apple.com/research/gaussian-head-reconstruction</link>
    <description>We propose HeadsUp, a scalable feed-forward method for reconstructing high-quality 3D Gaussian heads from large-scale multi-camera setups. Our method employs an efficient encoder-decoder architecture that compresses input views into a compact latent representation. This latent representation is then decoded into a set of UV-parameterized 3D Gaussians anchored to a neutral head template. This UV representation decouples the number of 3D Gaussians from the number and resolution of input images, enabling training with many high-resolution input views. We train and evaluate our model on an…</description>
    <pubDate>Fri, 08 May 2026 00:00:00 GMT</pubDate>
  </item>

  <item>
    <guid isPermaLink="false">rvpo-risk-sensitive-alignment</guid>
    <title>RVPO: Risk-Sensitive Alignment via Variance Regularization</title>
    <link>https://machinelearning.apple.com/research/rvpo-risk-sensitive-alignment</link>
    <description>Current critic-less RLHF methods aggregate multi-objective rewards via an arithmetic mean, leaving them vulnerable to constraint neglect: high-magnitude success in one objective can numerically offset critical failures in others (e.g., safety or formatting), masking low-performing “bottleneck” rewards vital for reliable multi-objective alignment. We propose Reward-Variance Policy Optimization (RVPO), a risk-sensitive framework that penalizes inter-reward variance during advantage aggregation, shifting the objective from “maximize sum” to “maximize consistency.” We show via Taylor expansion…</description>
    <pubDate>Fri, 08 May 2026 00:00:00 GMT</pubDate>
  </item>

  <item>
    <guid isPermaLink="false">ppml-2026</guid>
    <title>Apple Workshop on Privacy-Preserving Machine Learning &amp; AI 2026</title>
    <link>https://machinelearning.apple.com/updates/ppml-2026</link>
    <description>At Apple, we believe privacy is a fundamental human right. As AI capabilities increase and become more integrated into people’s daily lives, advancing research in privacy-preserving techniques is increasingly important to ensure privacy is protected while users enjoy innovative AI experiences.
Apple’s fundamental research has consistently pushed the state of the art in this domain, and earlier this year, we hosted the Workshop on Privacy-Preserving Machine Learning &#x26; AI. This two-day event brought together Apple researchers and members of the broader research community to discuss the…</description>
    <pubDate>Fri, 08 May 2026 00:00:00 GMT</pubDate>
  </item>

  <item>
    <guid isPermaLink="false">velox</guid>
    <title>Velox: Learning Representations of 4D Geometry and Appearance</title>
    <link>https://machinelearning.apple.com/research/velox</link>
    <description>We introduce a framework for learning latent representations of 4D objects which are descriptive, faithfully capturing object geometry and appearance; compressive, aiding in downstream efficiency; and accessible, requiring minimal input, i.e., an unstructured dynamic point cloud, to construct. Specifically, Velox trains an encoder to compress spatiotemporal color point clouds into a set of dynamic shape tokens. These tokens are supervised using two complementary decoders: a 4D surface decoder, which models the time-varying surface distribution capturing the geometry; and a Gaussian decoder…</description>
    <pubDate>Fri, 08 May 2026 00:00:00 GMT</pubDate>
  </item>

  <item>
    <guid isPermaLink="false">compression</guid>
    <title>What Matters in Practical Learned Image Compression</title>
    <link>https://machinelearning.apple.com/research/compression</link>
    <description>One of the major differentiators unlocked by learned codecs relative to their hard-coded traditional counterparts is their ability to be optimized directly to appeal to the human visual system. Despite this potential, a perceptual yet practical image codec is yet to be proposed. In this work, we aim to close this gap. We conduct a comprehensive study of the key modeling choices that govern the design of a practical learned image codec, jointly optimized for perceptual quality and runtime — including within the ablations several novel techniques. We then perform performance-aware neural…</description>
    <pubDate>Thu, 07 May 2026 00:00:00 GMT</pubDate>
  </item>

    </channel>
  </rss>
