| XLA (Accelerated Linear Algebra) | |
|---|---|
| Developer | OpenXLA |
| Repository | xla on GitHub |
| Written in | C++ |
| Operating system | Linux, macOS, Windows |
| Type | compiler |
| License | Apache License 2.0 |
| Website | openxla.org |
XLA (Accelerated Linear Algebra) is an open-source compiler for machine learning developed by the OpenXLA project.[1] XLA improves the performance of machine learning models by compiling their computation graphs into optimized machine code at a level below individual framework operations, making it particularly useful for large-scale computations and high-performance models. Key features of XLA include:[2]
- Compilation of Computation Graphs: Compiles computation graphs into efficient machine code (see the sketch after this list).
- Optimization Techniques: Applies operation fusion, memory optimization, and other graph-level transformations.
- Hardware Support: Optimizes models for a range of hardware, including CPUs, GPUs, and machine learning accelerators such as Google's TPUs.
- Improved Execution Time: Aims to reduce model execution time for both training and inference.
- Seamless Integration: Can be used with existing machine learning code with minimal changes.
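As a minimal sketch of how a framework invokes XLA, the following Python example uses JAX, which compiles functions through XLA when they are wrapped in jax.jit; the function name `predict` and the array shapes are illustrative choices, not part of any OpenXLA API:

```python
import jax
import jax.numpy as jnp

def predict(w, b, x):
    # Matrix multiply, bias add, and ReLU; XLA's operation fusion can
    # combine the element-wise add and maximum into a single kernel.
    return jnp.maximum(jnp.dot(x, w) + b, 0.0)

# jax.jit traces the function into a computation graph and hands it to
# XLA, which compiles it to machine code for the local back-end.
predict_jit = jax.jit(predict)

w = jnp.ones((4, 2))
b = jnp.zeros((2,))
x = jnp.ones((3, 4))
print(predict_jit(w, b, x))  # first call compiles; later calls reuse the executable
```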
XLA gives developers a way to improve the computational efficiency of machine learning models without rewriting them for each hardware target.[3][4]
OpenXLA Project
OpenXLA Project is an open-source machine learning compiler and infrastructure initiative intended to provide a common set of tools for compiling and deploying machine learning models across different frameworks and hardware platforms. It provides a modular compilation stack that can be used by major deep learning frameworks like JAX, PyTorch, and TensorFlow. The project focuses on supplying shared components for optimization, portability, and execution across CPUs, GPUs, and specialized accelerators. Its design emphasizes interoperability between frameworks and a standardized set of representations for model computation.
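As a sketch of the framework interoperability described above, the same kind of computation can reach XLA from TensorFlow by requesting compilation on a traced function (the function below is an illustrative example, not an OpenXLA API):

```python
import tensorflow as tf

# jit_compile=True asks TensorFlow to compile the traced graph with XLA
# instead of executing the ops one by one.
@tf.function(jit_compile=True)
def dense_relu(x, w, b):
    return tf.nn.relu(tf.matmul(x, w) + b)

x = tf.ones((3, 4))
w = tf.ones((4, 2))
b = tf.zeros((2,))
print(dense_relu(x, w, b))
```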
Components
The OpenXLA ecosystem includes several core components:
- XLA – A deep learning compiler that optimizes computational graphs for multiple hardware targets.
- PJRT – A runtime interface that allows different back-ends to connect to XLA through a consistent API.
- StableHLO – A high-level operator set intended to serve as a stable, portable representation of ML models across compilers and frameworks (see the sketch after this list).
- Shardy – An MLIR-based system for expressing and propagating tensor sharding in models that run across distributed or multi-device environments.
- Additional profiling, testing, and integration tools maintained under the OpenXLA organization.
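To make StableHLO's role concrete, the following hedged sketch asks JAX to lower a function and print the portable representation handed to the compiler; the exact textual output and lowering API details vary across JAX versions:

```python
import jax
import jax.numpy as jnp

def f(x):
    return jnp.sin(x) * 2.0

# .lower() stops before back-end compilation; .as_text() renders the
# result as MLIR, which recent JAX versions emit in the StableHLO dialect.
lowered = jax.jit(f).lower(jnp.ones((8,)))
print(lowered.as_text())
```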
Users and adopters
Several machine learning frameworks can use or interoperate with OpenXLA components, including JAX, TensorFlow, and parts of the PyTorch ecosystem. The project is developed with participation from multiple hardware and software organizations that contribute back-end integrations, testing, or specifications for their devices. These organizations include Alibaba, Amazon Web Services, AMD, Anyscale, Apple, Arm, Cerebras, Google, Graphcore, Hugging Face, Intel, Meta, NVIDIA, and SiFive.[5]
Supported target devices
In addition to the CPUs, GPUs, and TPUs targeted directly by XLA, hardware vendors maintain OpenXLA integrations for their own devices, including:
- Intel GPUs, via the intel-extension-for-openxla PJRT plugin.[6]
- Apple silicon GPUs, via Apple's Metal plugin for JAX.[7]
- AWS Trainium and Inferentia accelerators, via the AWS Neuron SDK.[8]
- The Cerebras Wafer-Scale Engine.[9]
- Graphcore IPUs, via Graphcore's Poplar software.[10]
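Back-ends like these typically attach through PJRT plugins, so framework code does not change. As a hedged illustration, following Apple's published setup for JAX on Metal (the jax-metal package name is taken from that documentation, not verified here), an unmodified program simply sees a new device:

```python
import jax

# With a PJRT plugin installed (e.g. `pip install jax-metal` on Apple
# silicon), the new back-end appears in the device list.
print(jax.devices())  # e.g. [METAL(id=0)] when the Metal plugin is active
```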
Governance
OpenXLA is developed as a community project with its work carried out in public repositories, discussion forums, and design meetings. Some components, such as StableHLO, began with stewardship from specific organizations and have outlined plans for more formal and distributed governance models as the project matures.
History
The project was announced in 2022 as an effort to coordinate development of ML compiler technologies across the major AI hardware and software companies listed above. It consolidated the XLA compiler, introduced StableHLO as a portable operator set, and created a unified structure for additional tooling; development continues within multiple repositories under the OpenXLA umbrella. The project was founded by Eugene Burmako, James Rubin, Magnus Hyttsten, Mehdi Amini, Navid Khajouei, and Thea Lamkin from Google's Machine Learning organization.[11]
References
- ↑ "OpenXLA Project". Retrieved December 21, 2024.
- ↑ Woodie, Alex (2023-03-09). "OpenXLA Delivers Flexibility for ML Apps". Datanami. Retrieved 2023-12-10.
- ↑ "TensorFlow XLA: Accelerated Linear Algebra". TensorFlow Official Documentation. Retrieved 2023-12-10.
- ↑ Smith, John (2022-07-15). "Optimizing TensorFlow Models with XLA". Journal of Machine Learning Research. 23: 45–60.
- ↑ "OpenXLA is available now to accelerate and simplify machine learning". Google Open Source Blog. Retrieved 2025-11-18.
- ↑ "intel/intel-extension-for-openxla". GitHub. Retrieved December 29, 2024.
- ↑ "Accelerated JAX on Mac - Metal - Apple Developer". Retrieved December 29, 2024.
- ↑ "Developer Guide for Training with PyTorch NeuronX — AWS Neuron Documentation". awsdocs-neuron.readthedocs-hosted.com. Retrieved 29 December 2024.
- ↑ Barsoum, Emad (13 April 2022). "Supporting PyTorch on the Cerebras Wafer-Scale Engine - Cerebras". Cerebras. Retrieved 29 December 2024.
- ↑ Ltd, Graphcore. "Poplar® Software". graphcore.ai. Retrieved 29 December 2024.
- ↑ "OpenXLA is available now to accelerate and simplify machine learning". Google Open Source Blog. Retrieved 2025-11-18.