InfiX.ai

Welcome to InfiX.ai! We believe our research will eventually lead to decentralized Generative AI—a future where everyone can access, contribute to, and benefit from AI equally.
Our Mission: Generative AI for all, intelligence in every task.

🤖 Our Model Series

🔗 Model Fusion & Model Merging

Model fusion refers to the process of combining multiple trained models—often from different domains, architectures, or training datasets—into a single, more powerful model. The goal is to integrate their strengths and knowledge, improving performance, generalization, or efficiency.
Model merging is a specific type of model fusion that involves combining the internal parameters (typically weights) of two or more pretrained models to produce a single model that inherits knowledge from all sources. Unlike ensemble methods, model merging produces a single merged model rather than relying on multiple models at inference time.
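As a minimal sketch of the simplest form of merging described above, the following linearly interpolates the weights of two models that share the same parameter names and shapes. The function name `merge_weights`, the mixing coefficient `alpha`, and the toy parameter dictionaries are illustrative assumptions, not part of any InfiX.ai release; practical merging methods are usually more involved than plain interpolation.

```python
from typing import Dict, List

# A "model" here is just a mapping from parameter name to a flat list of weights.
Weights = Dict[str, List[float]]


def merge_weights(a: Weights, b: Weights, alpha: float = 0.5) -> Weights:
    """Return alpha * a + (1 - alpha) * b for every shared parameter."""
    if a.keys() != b.keys():
        raise ValueError("models must share the same parameter names")
    merged: Weights = {}
    for name in a:
        if len(a[name]) != len(b[name]):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        merged[name] = [
            alpha * x + (1.0 - alpha) * y for x, y in zip(a[name], b[name])
        ]
    return merged


# Hypothetical toy models with identical architecture.
model_a = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
model_b = {"layer.weight": [3.0, 4.0], "layer.bias": [1.0]}

# One merged model is produced; nothing else is needed at inference time.
merged = merge_weights(model_a, model_b, alpha=0.5)
```

Unlike an ensemble, which would keep both `model_a` and `model_b` and combine their outputs on every request, only `merged` has to be stored and served.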

InfiFusion: InfiFusion is a logit-level fusion pipeline based on Universal Logit Distillation, enhanced with Top-K filtering and logits standardization. It supports both pairwise and unified fusion strategies to balance performance and efficiency.
InfiGFusion: InfiGFusion is a structure-aware extension that builds co-activation graphs from logits and aligns them via an efficient Gromov-Wasserstein loss approximation, capturing cross-dimension semantic dependencies for stronger reasoning.
InfiFPO: InfiFPO is a lightweight fusion method applied during the preference alignment phase. It injects fused model behavior into preference learning, providing a richer signal during DPO-style fine-tuning.
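To make the logit-level ingredients mentioned for InfiFusion more concrete, here is a rough sketch of logits standardization, Top-K filtering, and a sorted-probability distance in the spirit of logit distillation across different vocabularies. It is not the InfiFusion implementation: the function names, the order of the steps (standardize first, then filter), and the toy logits are all assumptions made for illustration.

```python
import math
from typing import List


def standardize(logits: List[float]) -> List[float]:
    """Z-score the logits so models with different logit scales are comparable."""
    mean = sum(logits) / len(logits)
    variance = sum((x - mean) ** 2 for x in logits) / len(logits)
    std = math.sqrt(variance) or 1.0  # fall back to 1.0 for constant logits
    return [(x - mean) / std for x in logits]


def top_k_probs(logits: List[float], k: int) -> List[float]:
    """Keep the k largest logits (sorted descending) and apply softmax to them."""
    kept = sorted(logits, reverse=True)[:k]
    peak = kept[0]  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in kept]
    total = sum(exps)
    return [e / total for e in exps]


def sorted_prob_distance(
    teacher_logits: List[float], student_logits: List[float], k: int
) -> float:
    """L1 distance between the sorted top-k probabilities of two models.

    Sorting removes the need for a token-to-token vocabulary mapping,
    so the two logit vectors may have different lengths.
    """
    if k < 1 or k > min(len(teacher_logits), len(student_logits)):
        raise ValueError("k must be between 1 and the smaller vocabulary size")
    teacher = top_k_probs(standardize(teacher_logits), k)
    student = top_k_probs(standardize(student_logits), k)
    return sum(abs(t - s) for t, s in zip(teacher, student))


# Hypothetical logits for one position from models with different vocabularies.
teacher_logits = [4.0, 2.0, 1.0, 0.5, -1.0]
student_logits = [1.5, 1.2, 0.3, -0.2]
loss = sorted_prob_distance(teacher_logits, student_logits, k=3)
```

Because of the standardization step, rescaling or shifting one model's logits by a positive affine map leaves this distance unchanged, which is one plausible reason to standardize before comparing models trained with different logit scales.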

🧠 Reasoning-Enhanced Low-Resource Training Pipeline

InfiR: InfiR aims to advance AI systems by improving reasoning, reducing adoption barriers, and addressing privacy concerns through smaller model sizes.
InfiR-FP8: InfiR-FP8 is a smaller reasoning-enhanced model trained from scratch in FP8 precision. Training converged successfully while reducing memory usage by 10% and improving training speed by 20%. The model will be released in mid-September.
InfiAlign: InfiAlign is a scalable and data-efficient post-training framework that combines supervised fine-tuning (SFT) and reinforcement learning (RL) with a high-quality data selection pipeline to enhance reasoning in large language models.
InfiMMR: InfiMMR is a novel three-phase curriculum framework that systematically enhances multimodal reasoning capabilities in small language models through foundational reasoning activation, cross-modal adaptation, and multimodal reasoning enhancement.

🖥️ Advanced Vision-Native Agent for GUI Interaction

InfiGUIAgent: InfiGUIAgent is a GUI agent that embeds native hierarchical and expectation-reflection reasoning through a unique two-stage supervised pipeline, enabling robust, multi-step GUI task automation.
InfiGUI-R1: InfiGUI-R1 is a GUI agent developed via the Actor2Reasoner framework, which evolves a reactive model into a deliberative reasoner capable of sophisticated planning and error recovery through spatial reasoning distillation and reinforcement learning.
InfiGUI-G1: InfiGUI-G1 is a multimodal GUI agent that employs Adaptive Exploration Policy Optimization (AEPO) to improve semantic alignment in GUI grounding. The novel training framework achieves up to 8.3% relative improvement over baseline methods.
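A relative improvement is measured against the baseline score, so it differs from the absolute gain in percentage points. The short sketch below shows the arithmetic; the baseline and new accuracy values are hypothetical numbers chosen for illustration and are not results from the InfiGUI-G1 paper.

```python
def relative_improvement(baseline: float, new: float) -> float:
    """Return the improvement of `new` over `baseline` as a percentage of baseline."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (new - baseline) / baseline * 100.0


# Hypothetical grounding accuracies in percent (not taken from the paper).
baseline_accuracy = 60.0
new_accuracy = 65.0

absolute_gain = new_accuracy - baseline_accuracy  # gain in percentage points
relative_gain = relative_improvement(baseline_accuracy, new_accuracy)
```

With these made-up numbers, a gain of 5 percentage points corresponds to a relative improvement of about 8.3%.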

📰 News

🔥[2025/8/11] Our paper "InfiGUI-G1: Advancing GUI Grounding with Adaptive Exploration Policy Optimization" released. More information can be found in the repository. The model is available here.
🔥[2025/5/20] Our paper "InfiGFusion: Graph-on-Logits Distillation via Efficient Gromov-Wasserstein for Model Fusion" released. More information can be found in the repository. The model is available here.
🔥[2025/5/20] Our paper "InfiFPO: Implicit Model Fusion via Preference Optimization in Large Language Models" released. More information can be found in the repository. The model is available here.
🔥[2025/4/19] Our paper "InfiGUI-R1: Advancing Multimodal GUI Agents from Reactive Actors to Deliberative Reasoners" released. More information can be found in the repository.
🔥[2025/1/9] Our paper "InfiGUIAgent: A Multimodal Generalist GUI Agent with Native Reasoning and Reflection" released.