AIOZ AI, Singapore (2026)

12/07/2025

📖 Deep Quantization 📘

📢 Part 3: Introducing PyTorch Quantization: Slimmer, Speedier Models for CPU & Mobile!

🔧 Two Modes to Fit Your Workflow
Eager Mode: Manually fuse layers (Conv+BN+ReLU) and insert quant stubs—straightforward for supported torch.nn modules.
FX Graph Mode: Automatic graph rewriting for wider model support—just tweak your model once and let FX do the rest.

🎛️ Three Quantization Algorithms
1. Dynamic Quantization
▪️ Weights are quantized ahead of time, when the model is converted; activations are quantized on the fly at inference.
▪️ Ideal for transformers, LSTMs—drop-in speed boost with minimal fuss.
2. Static (Post-Training) Quantization
▪️ Quantize weights ahead of time and calibrate activation ranges on representative data before inference.
▪️ Runs on FBGEMM for x86 or QNNPACK for ARM; choose the backend that matches your deployment hardware.
3. Quantization-Aware Training (QAT)
▪️ Simulates int8 effects during training, then fine-tunes to recover precision.
▪️ Yields the highest post-quant accuracy for vision and speech nets.
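
🧪 To make the three algorithms concrete, here is a minimal, framework-free sketch of the int8 arithmetic they share. It is not PyTorch's implementation, and the helper names (`affine_params`, `fake_quantize`) are made up for illustration. Roughly speaking, dynamic quantization computes the activation range at run time, static quantization freezes a range found during calibration, and QAT applies the quantize-then-dequantize round trip inside the training forward pass.

```python
def affine_params(lo, hi, qmin=-128, qmax=127):
    """Map the float range [lo, hi] onto the integer range [qmin, qmax]."""
    lo, hi = min(lo, 0.0), max(hi, 0.0)       # keep 0.0 exactly representable
    scale = (hi - lo) / (qmax - qmin) or 1.0  # fall back to 1.0 for all-zero input
    zero_point = int(round(qmin - lo / scale))
    return scale, max(qmin, min(qmax, zero_point))

def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Round each float to the nearest integer level and clip to int8."""
    return [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in x]

def dequantize(q, scale, zero_point):
    """Map integer levels back to approximate float values."""
    return [(v - zero_point) * scale for v in q]

def fake_quantize(x, scale, zero_point):
    """Quantize then dequantize: the rounding error QAT exposes to training."""
    return dequantize(quantize(x, scale, zero_point), scale, zero_point)

# Hypothetical activations; the range is taken from the tensor itself,
# which is what a dynamic scheme would do at run time.
activations = [-1.0, -0.25, 0.0, 0.5, 2.0]
scale, zp = affine_params(min(activations), max(activations))
restored = fake_quantize(activations, scale, zp)
max_error = max(abs(a - b) for a, b in zip(activations, restored))
```

For values inside the chosen range, the round-trip error stays within half a quantization step (`scale / 2`); values outside a frozen, calibrated range would be clipped instead.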

🚀 3-Step Workflow
1️⃣ Prepare Your Model
▪️ Fuse adjacent layers for a single-pass compute (e.g., Conv+BN+ReLU).
▪️ Add QuantStub/DeQuantStub to mark where tensors enter and leave the quantized region, whether that is the whole model or only specific submodules.
2️⃣ Configure & Quantize
▪️ Pick your algorithm (Dynamic, Static, or QAT).
▪️ Supply a small representative dataset for range calibration (Static); QAT learns its ranges during fine-tuning instead.
3️⃣ Validate on CPU
▪️ Run inference through the PyTorch CPU backend (mobile too!).
▪️ Compare accuracy against your float32 baseline—expect a tiny drop (often
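
🧪 Why does fusing help? At inference time BatchNorm is just a per-channel scale and shift, so it can be folded into the weights of the layer before it. The sketch below shows that algebra on a tiny linear layer in plain Python (a convolution is also linear per output channel, so the same folding applies). The function names and numbers are hypothetical, and this is not PyTorch's fusion code.

```python
import math

def linear(weight, bias, x):
    """One output per weight row: y_c = w_c . x + b_c."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]

def batchnorm(y, gamma, beta, mean, var, eps=1e-5):
    """Inference-mode BatchNorm using fixed running statistics."""
    return [g * (yi - m) / math.sqrt(v + eps) + bt
            for yi, g, bt, m, v in zip(y, gamma, beta, mean, var)]

def relu(y):
    return [max(0.0, v) for v in y]

def fold_batchnorm(weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold the BatchNorm scale and shift into the preceding layer."""
    folded_w, folded_b = [], []
    for row, b, g, bt, m, v in zip(weight, bias, gamma, beta, mean, var):
        factor = g / math.sqrt(v + eps)
        folded_w.append([w * factor for w in row])
        folded_b.append((b - m) * factor + bt)
    return folded_w, folded_b

# Hypothetical two-channel layer and statistics.
weight, bias = [[0.5, -1.0], [2.0, 0.25]], [0.1, -0.2]
gamma, beta = [1.5, 0.8], [0.0, 0.3]
mean, var = [0.2, -0.1], [4.0, 0.25]
x = [1.0, 2.0]

unfused = relu(batchnorm(linear(weight, bias, x), gamma, beta, mean, var))
fw, fb = fold_batchnorm(weight, bias, gamma, beta, mean, var)
fused = relu(linear(fw, fb, x))  # one pass instead of three
```

Both paths give the same output up to floating-point rounding, which is why fusing before quantization removes work without changing the model.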

05/07/2025

📖 Deep Quantization

📢 Part 2: TensorFlow Lite Quantization in 3 Easy Steps!

1️⃣ Export to .tflite
Convert your trained Keras or SavedModel into TensorFlow Lite’s FlatBuffer format for lightning-fast loading on any device.

2️⃣ Apply INT8 Quantization
Enable default optimizations and feed a small representative sample of your data so TFLite can calibrate weight and activation ranges—shrinking model size by ~4× without retraining.

3️⃣ Validate with TFLite Interpreter
Load the quantized file, rescale inputs using the built-in scale & zero-point, run inference on your test set, and compare against ground truth—expect a tiny accuracy dip (
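
🧪 The rescaling in step 3 follows the relation real_value = scale × (int8_value − zero_point). Below is a small plain-Python sketch of that conversion in both directions. The scale and zero-point values are hypothetical; with a real model they would typically be read from the `quantization` entry of `interpreter.get_input_details()` and `interpreter.get_output_details()`.

```python
def to_int8(values, scale, zero_point):
    """Float -> int8 using the tensor's quantization parameters."""
    return [max(-128, min(127, round(v / scale) + zero_point)) for v in values]

def to_float(quantized, scale, zero_point):
    """int8 -> float: real = scale * (q - zero_point)."""
    return [(q - zero_point) * scale for q in quantized]

# Hypothetical input parameters: floats in [-1, 1) mapped with scale 1/128.
in_scale, in_zero_point = 1 / 128, 0
q_input = to_int8([-1.0, 0.0, 0.5], in_scale, in_zero_point)

# Hypothetical output parameters: probabilities in [0, 1) with scale 1/256.
out_scale, out_zero_point = 1 / 256, -128
probabilities = to_float([-128, 0, 127], out_scale, out_zero_point)
```

Here `q_input` is `[-128, 0, 64]` and `probabilities` is `[0.0, 0.5, 0.99609375]`. Feeding unscaled floats into an int8 model is a common source of unexpectedly poor accuracy.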

28/06/2025

📖 Deep Quantization

🚀 Part 1: Unlock Lightning-Fast AI on Tiny Devices with Quantization! 🤖💨

Deep learning models are getting insanely accurate—but their size often makes them too bulky for IoT gadgets and basic smartphones. Enter quantization, the secret sauce that slashes model size, cuts latency, and turbocharges inference speed—all with minimal hit to accuracy. 🔍✨

🎯 What’s Quantization?
It’s all about storing and computing tensors in lower-bit data types (such as int8) instead of full 32-bit floats. Since deep nets are dominated by large matrix multiplications, fewer bits mean smaller files, fewer memory reads, and faster math, usually with only a small loss in precision.

✨ Two Flavors to Try:
1. Post-Training Quantization 🛠️
▪️ Train your model in full-precision, then freeze & quantize.
▪️ Super simple to apply, but any quantization error can’t be “learned away.”
▪️ 👉 Dynamic Range: 4× smaller, 2–3× speedup on CPU
▪️ 👉 Full Integer: 4× smaller, 3× speedup across CPU/TPU/Microcontrollers
▪️ 👉 Float16: 2× smaller, GPU-friendly boosts
2. Quantization-Aware Training 🎓
▪️ Inject quantization into the forward pass during training.
▪️ Model “learns” to compensate for low-bit errors, giving you much higher post-quant accuracy.
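
🧪 The 4× and 2× figures above come straight from bytes per stored value. A quick back-of-the-envelope sketch, assuming a hypothetical 25M-parameter model and ignoring the small overhead of scales, zero-points, and file metadata:

```python
BYTES_PER_VALUE = {"float32": 4, "float16": 2, "int8": 1}

def weight_megabytes(num_params, dtype):
    """Approximate weight storage in MB for a given data type."""
    return num_params * BYTES_PER_VALUE[dtype] / 1e6

num_params = 25_000_000  # hypothetical model size
sizes = {dtype: weight_megabytes(num_params, dtype) for dtype in BYTES_PER_VALUE}
ratios = {dtype: sizes["float32"] / size for dtype, size in sizes.items()}
```

This gives 100 MB for float32, 50 MB for float16 (2× smaller), and 25 MB for int8 (4× smaller). Speedups are hardware-dependent and cannot be derived the same way.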

🔗 Dive into today’s article for an in-depth guide, complete with figures on size-reduction and error-compensation workflows. We’ll get you up to speed on the math behind quantization and compare methods🌟👇

📚 Blog Part 1: https://blog.ai.aioz.io/guides/ml-ops/Quantization%20in%20Deep%20learning_13/

21/06/2025

📖 Foundation Model and Federated Learning 📘

📢 Part 2: Introducing FedFoundry: The Future of Foundation Models Meets Federated Learning!

🌐 What Is FedFoundry?
FedFoundry integrates massive pre‑trained “foundation” architectures (e.g., BERT, GPT, CLIP, Stable Diffusion) with federated learning—either centralized or decentralized—so that clients collaboratively fine‑tune these versatile backbones without ever sharing raw data.

🚀 Why FedFoundry Over Traditional FL or Solo Foundation Models?
✨ 🔒 Data Privacy & Security
▪️ Raw data stays on‑device or in‑silo—only encrypted weight updates or small adapter modules traverse the network.
▪️ Complies with stringent regulations (HIPAA, GDPR) by design, minimizing leak risks.
✨ ⚙️ Compute & Communication Efficiency
▪️ Heavy, self‑supervised pre‑training is performed once; clients share and update light‑weight task‑specific parameters (LoRA, prompt vectors, bias‑only).
▪️ Dramatically lowers bandwidth usage compared to full‑model exchanges.
✨ 📈 Rapid Convergence & Multi‑Modal Performance
▪️ Rich, generalizable representations from foundation models accelerate fine‑tuning on local tasks (prompt engineering, in‑context learning).
▪️ Handles diverse data types—text, images, audio—within a unified framework.
✨ 🎯 Tackling Heterogeneity & Personalization
▪️ Non‑IID Data: Techniques like Federated Zero‑Shot Learning with CLIP embeddings align disjoint label spaces across clients.
▪️ Meta‑Learning: Hierarchical peer‑to‑peer frameworks (PPFM) adapt quickly to each client’s unique distribution via node clustering and multi‑stage meta‑training.
▪️ Fairness & Robustness: Zero‑shot data augmentation balances under‑represented classes, reducing accuracy variance across clients.
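
🧪 To illustrate how only lightweight parameters travel over the network, here is a minimal sketch of aggregating client adapter updates. It assumes sample-weighted averaging in the style of FedAvg, which the post does not specify, and the parameter name `adapter` is hypothetical. The frozen backbone never leaves the clients.

```python
def federated_average(client_updates):
    """Average adapter parameters, weighting each client by its sample count.

    client_updates: list of (num_samples, {param_name: [floats]}).
    """
    total = sum(num_samples for num_samples, _ in client_updates)
    reference = client_updates[0][1]
    return {
        name: [
            sum(n * params[name][i] for n, params in client_updates) / total
            for i in range(len(reference[name]))
        ]
        for name in reference
    }

# Two hypothetical clients that share only a tiny adapter vector.
updates = [
    (100, {"adapter": [1.0, 2.0]}),
    (300, {"adapter": [3.0, 6.0]}),
]
global_adapter = federated_average(updates)
```

The client with 300 samples pulls the average toward its values, giving `{"adapter": [2.5, 5.0]}`.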

📚 Blog Part 2: https://blog.ai.aioz.io/guides/decentralized-ai/FoundationModelling_21/

14/06/2025

📖 Foundation Model and Federated Learning

📢 Part 1: Introducing Federated Foundation Models (FFMs): Harnessing the Power of Pre‑Trained Giants Across Silos!

🌐 What Are FFMs?
Federated Foundation Models unite massive pre‑trained architectures (e.g., BERT, GPT, DALL·E) with federated learning, enabling model improvement on distributed data without moving sensitive user or institutional datasets.

🚀 Why Combine Foundation Models with Federated Learning?
➡️ 🔒 Data Privacy & Security
▪️ Raw data never leaves the client device or silo.
▪️ Only encrypted model updates are exchanged, preserving confidentiality.
➡️ ⚙️ Compute & Communication Efficiency
▪️ Heavy pre‑training done once; clients share lighter fine‑tuning workloads.
▪️ Parameter‑efficient tuning (e.g., LoRA adapters, bias‑only updates) slashes upload/download sizes.
➡️ 📈 Rapid Convergence & Performance
▪️ Foundation models’ rich representations accelerate federated adaptation.
▪️ Fewer communication rounds needed to achieve high accuracy on downstream tasks.
➡️ 🎛️ Multimodal Adaptability & Personalization
▪️ Single backbone handles text, vision, audio, and more across clients.
▪️ Client‑specific adapters enable personalized behavior without retraining entire model.
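
🧪 A quick worked example of why parameter-efficient tuning shrinks uploads. A low-rank (LoRA-style) update to a weight matrix of shape d_out × d_in stores two thin matrices of rank r instead of the full matrix. The dimensions below are hypothetical.

```python
def full_update_params(d_out, d_in):
    """Values sent when the whole weight matrix is exchanged."""
    return d_out * d_in

def lora_update_params(d_out, d_in, rank):
    """Values sent for a rank-r update: B (d_out x r) plus A (r x d_in)."""
    return rank * (d_out + d_in)

d_out, d_in, rank = 768, 768, 8  # hypothetical layer size and rank
full = full_update_params(d_out, d_in)
lora = lora_update_params(d_out, d_in, rank)
reduction = full / lora
```

For this layer the full update has 589,824 values and the low-rank update has 12,288, a 48× reduction per layer before any compression.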

📚 Blog Part 1: https://blog.ai.aioz.io/guides/decentralized-ai/FoundationModelling_20/

07/06/2025

📖 Decentralized Federated Learning and Research Directions 📘

📢 Part 2: Introducing Topology Design in Decentralized Federated Learning (DFL): Building the Backbone of Collaborative AI!

🌐 What Is Topology Design?
Topology design determines who communicates with whom in a DFL network. By shaping the peer‑to‑peer overlay, it directly influences convergence speed, communication overhead, and trust distribution—key factors for robust, efficient decentralized training.

🚀 Why Topology Matters?
✨ 🔄 Throughput‑Optimal Overlays
▪️ Formulate the MCT problem: given a connectivity graph, find a directed subgraph that minimizes cycle time (the time taken per communication round), which in turn maximizes throughput.
▪️ Undirected case: use a minimum‑weight spanning tree (Prim’s algorithm).
▪️ Directed Euclidean graphs: apply Christofides’ approximation for NP‑hard scenarios.
✨ 🕸️ Multigraph‑Based Adaptive Isolation
▪️ Build a multigraph overlay and decompose it into simple graphs each round.
▪️ Isolate high‑delay nodes temporarily to prevent bottlenecks.
▪️ Reintegrate them later to balance load and accelerate overall training.
✨ 🎯 Flexibility & Customization
▪️ Dynamic rewiring lets you adapt connections on‑the‑fly based on node performance and data drift.
▪️ Topology variants (fully connected, partial meshes, clustered) can be mixed to balance communication cost versus convergence speed.
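
🧪 For the undirected case mentioned above, here is a minimal sketch of Prim's algorithm choosing a low-delay overlay. The silo indices and link delays are hypothetical, and the sketch covers only the spanning-tree step, not the full MCT formulation.

```python
import heapq

def prim_mst(num_nodes, edges):
    """Return the minimum-weight spanning tree of a connected undirected graph.

    edges: list of (u, v, delay). Result: list of (u, v, delay) tree edges.
    """
    adjacency = {node: [] for node in range(num_nodes)}
    for u, v, delay in edges:
        adjacency[u].append((delay, u, v))
        adjacency[v].append((delay, v, u))

    visited = {0}
    heap = list(adjacency[0])
    heapq.heapify(heap)
    tree = []
    while heap and len(visited) < num_nodes:
        delay, u, v = heapq.heappop(heap)  # cheapest edge leaving the tree
        if v in visited:
            continue
        visited.add(v)
        tree.append((u, v, delay))
        for edge in adjacency[v]:
            if edge[2] not in visited:
                heapq.heappush(heap, edge)
    return tree

# Hypothetical link delays (ms) between four silos.
links = [(0, 1, 10), (0, 2, 40), (1, 2, 15), (1, 3, 30), (2, 3, 5)]
overlay = prim_mst(4, links)
total_delay = sum(delay for _, _, delay in overlay)
```

The resulting overlay keeps the links 0-1, 1-2, and 2-3 for a total delay of 30, dropping the two slowest links.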

📚 Blog Part 2: https://blog.ai.aioz.io/guides/decentralized-ai/DecentralizedFederatedLearning_19/

31/05/2025

📖 Decentralized Federated Learning and Research Directions

📢 Part 1: Introducing Decentralized Federated Learning (DFL): The Next Frontier in Collaborative AI!

🌐 What Is DFL?
Decentralized Federated Learning (DFL) removes the central server and lets clients (devices or organizational silos) share model updates directly, preserving privacy and reducing trust bottlenecks.

🚀 Why DFL Over Centralized FL?
➡️ 🔒 Enhanced Privacy & Trust
▪️ No single server means no single point of attack.
▪️ Trust is distributed across all participating nodes.
➡️ ⚙️ Fault Tolerance & Robustness
▪️ Nodes continually detect and adapt to peers dropping out.
▪️ Eliminates risks from server failure or network partition.
➡️ 📈 Balanced Communication Load
▪️ Peer‑to‑peer exchanges avoid server congestion.
▪️ Adaptive connection patterns save bandwidth and reduce latency.
➡️ 🎛️ Flexible Topology & Customization
▪️ Dynamic node networks tailored to specific scenarios.
▪️ Optimized compute and communication per deployment.
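
🧪 How can nodes agree on a model without a server? One common building block is gossip averaging, where every node repeatedly averages its parameters with its neighbors. The sketch below uses a single number per node as a stand-in for a model and a hypothetical four-node ring with uniform weights. It illustrates the idea only and is not a specific DFL algorithm from the post.

```python
def gossip_round(values, neighbors):
    """Each node replaces its value with the mean of itself and its neighbors."""
    return [
        (values[i] + sum(values[j] for j in neighbors[i])) / (1 + len(neighbors[i]))
        for i in range(len(values))
    ]

# Hypothetical ring: every node talks to exactly two peers, no central server.
ring = {0: [3, 1], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
values = [1.0, 2.0, 3.0, 6.0]
target = sum(values) / len(values)  # the average a central server would compute

for _ in range(50):
    values = gossip_round(values, ring)
```

Because every node has the same number of neighbors, the overall average is preserved each round, and all nodes converge to the value a central server would have computed.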

📚 Blog Part 1: https://blog.ai.aioz.io/guides/decentralized-ai/DecentralizedFederatedLearning_18/

24/05/2025

📖 FedEFM: Federated Endovascular Foundation Model with Unseen Data (ICRA 2025) 📘

[Conference website: https://2025.ieee-icra.org/]

📢 Part 4: FedEFM: Federated Endovascular Foundation Model Evaluation

🔬 Ensuring high-performance AI models for endovascular interventions while maintaining data privacy is a critical challenge. Traditional methods struggle with unseen data, resulting in decreased accuracy and poor generalization.

🚀 FedEFM revolutionizes this space with federated learning!
▪️ 📈 State-of-the-art validation across Centralized Local Learning (CLL), Client-server Federated Learning (CFL), and Decentralized Federated Learning (DFL), outperforming existing methods with up to 98.2% accuracy.
▪️ 🔍 Robust fine-tuning for endovascular classification & segmentation tasks, leveraging foundation models like CLIP, SAM, and LVM-Med.
▪️ ⚖️ Unseen data resilience—FedEFM maintains 84.9% accuracy in extreme cases where all data labels are unseen.
▪️ 🤖 Backbone adaptability—integrates with UNet, TransUNet, SwinUNet, and ViT to improve segmentation and classification tasks.

FedEFM demonstrates significant potential for federated learning in medical AI, setting new standards in privacy-preserving endovascular intervention models. Challenges remain in hardware variability and training time, but the approach paves the way for future applications, including robotic-assisted surgery and pathology.

📜 Paper Link: https://arxiv.org/pdf/2501.16992
🔗 Github Page: https://aioz-ai.github.io/FedEFM/
⚙️ Source Code: https://github.com/aioz-ai/FedEFM
📚 Blog Part 4: https://ai.aioz.io/research/FedEFM4/

17/05/2025

📖 FedEFM: Federated Endovascular Foundation Model with Unseen Data (ICRA 2025)

[Conference website: https://2025.ieee-icra.org/]

📢 Part 3: FedEFM - Training the Federated Endovascular Foundation Model!

In this part, we dive into the methodology behind training FedEFM and integrating its weights across multiple downstream tasks!

🌐 Federated Distillation for Unseen Data
Addressing the challenge of unseen data, FedEFM leverages a decentralized federated learning approach with federated knowledge distillation.
▪️ 🏥 Each hospital trains a local model using its own data.
▪️ 🔗 Weights are transferred to neighboring silos to learn from their data.
▪️ 📈 Earth Mover’s Distance (EMD) measures similarity and optimizes weight fusion.
▪️ ⚖️ A loss function balances local training and knowledge from other silos.

📊 Differentiable Earth Mover’s Distance (EMD)
FedEFM introduces EMD-based optimization for effective knowledge transfer:
▪️ We compare feature representations between local and transferred models.
▪️ The optimal matching flow minimizes the distance for knowledge adaptation.
▪️ The gradients are computed efficiently for backpropagation without modifying the optimization path.
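
🧪 For intuition about what EMD measures, here is the simplest possible case: two 1-D histograms over the same equally spaced bins, where the distance has a closed form (the summed gap between cumulative distributions). This is a simplification and an assumption of this sketch. The paper's differentiable EMD works on feature representations with an optimal matching flow, which this does not reproduce.

```python
from itertools import accumulate

def emd_1d(p, q):
    """Earth Mover's Distance between two histograms on the same unit-spaced bins."""
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    p_total, q_total = sum(p), sum(q)
    p_norm = [v / p_total for v in p]  # normalise so both carry unit mass
    q_norm = [v / q_total for v in q]
    return sum(abs(a - b) for a, b in zip(accumulate(p_norm), accumulate(q_norm)))

identical = emd_1d([1, 2, 1], [1, 2, 1])
far_apart = emd_1d([1, 0, 0], [0, 0, 1])
nearby = emd_1d([1, 0, 0], [0, 1, 0])
```

Moving all the mass two bins costs 2.0, one bin costs 1.0, and identical histograms cost 0.0, so a smaller EMD signals more similar distributions.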

🔄 Training & Aggregation
Each silo optimizes its local model using knowledge distilled from transferred weights:
▪️ 🎨 Softened outputs from teachers and students ensure smooth learning.
▪️ ⚛️ EMD-weighted objective function maintains convergence.
▪️ 🌟 Final aggregated model integrates the knowledge from all participating silos!
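
🧪 "Softened outputs" usually means dividing logits by a temperature before the softmax, then matching the student's distribution to the teacher's. Below is a minimal sketch of that standard distillation loss in plain Python. The temperature value and logits are hypothetical, and the EMD weighting described above is not modelled here.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened outputs, scaled by temperature squared."""
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s) for t, s in zip(teacher, student) if t > 0)
    return temperature ** 2 * kl

teacher_logits = [4.0, 1.0, 0.5]  # hypothetical model transferred from another silo
matched = distillation_loss(teacher_logits, teacher_logits)
mismatched = distillation_loss([0.5, 1.0, 4.0], teacher_logits)
```

The loss is zero when the student reproduces the teacher and grows as their softened predictions diverge. In a full objective it would be combined with the local task loss.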

🔬 What's Next?
We will validate FedEFM’s effectiveness on our Endovascular Intervention Dataset. Stay tuned for groundbreaking results in medical AI!

📜 Paper Link: https://arxiv.org/pdf/2501.16992
🔗 Github Page: https://aioz-ai.github.io/FedEFM/
⚙️ Source Code: https://github.com/aioz-ai/FedEFM
📚 Blog Part 3: https://ai.aioz.io/research/FedEFM3/

10/05/2025

📖 FedEFM: Federated Endovascular Foundation Model with Unseen Data (ICRA 2025)

[Conference website: https://2025.ieee-icra.org/]

📢 Part 2: FedEFM - A dataset for the Federated Endovascular Foundation Model

🔬 The development of a robust foundation model for endovascular intervention requires diverse and high-quality datasets. However, data collection and management remain challenging, especially in a federated learning setup where data privacy is critical.

🚀 FedEFM tackles these challenges!
🔹 🤖 Robotic Setup for Data Collection: A robotic system captures large-scale X-ray images using a master-follower control setup, generating the EIPhantom dataset with 4,700 labeled X-ray images.
🔹 🏥 Real & Simulated Data: We collect both real-world X-ray images and simulated data from the CathSim simulator, ensuring a diverse dataset for model training.
🔹 🔐 Federated Learning with Unseen Data: Hospitals possess unique datasets that are inaccessible to others. FedEFM leverages a multishot federated distillation algorithm with Earth Mover’s Distance (EMD) to enable learning across silos without sharing data.

🎯 Why It Matters?
🔹 🏗️ FedEFM is the first federated foundation model for endovascular intervention, ensuring seamless collaboration while maintaining patient data privacy.
🔹 📊 Pretrained weights from FedEFM serve as a powerful initialization for downstream medical imaging tasks.

⚡ What’s Next?
In the next part, we’ll dive into the technical details of training FedEFM! Stay tuned!

📜 Paper Link: https://arxiv.org/pdf/2501.16992
🔗 Github Page: https://aioz-ai.github.io/FedEFM/
⚙️ Source Code: https://github.com/aioz-ai/FedEFM
📚 Blog Part 2: https://ai.aioz.io/research/FedEFM2/

03/05/2025

📖 FedEFM: Federated Endovascular Foundation Model with Unseen Data (ICRA 2025)

[Conference website: https://2025.ieee-icra.org/]

📢 Part 1: Introducing FedEFM: The Future of Federated Endovascular Foundation Models!

🎨 Endovascular surgery relies on precise catheter and guidewire detection in X-ray images to ensure patient safety. However, training deep learning models for this task is challenging due to limited labeled data and strict privacy constraints.
🚀 FedEFM bridges this gap by training a foundation model using federated learning, enabling collaborative model improvement without sharing sensitive patient data.
▪️ 🎢 Federated Learning Setup: Trains across multiple hospital silos, ensuring privacy.
▪️ ✨ New Multishot Distillation Technique: Uses Earth Mover’s Distance to tackle unseen data challenges.
▪️ 🔖 Diverse Endovascular Dataset: Incorporates human, animal, phantom, and simulation X-ray images.
▪️ 🔄 Optimized for Downstream Tasks: Enhances model performance for catheter and guidewire segmentation.

FedEFM achieves state-of-the-art performance while preserving data security, setting a new benchmark for robotic-assisted endovascular surgery and medical AI research.
Stay tuned for the next part, where we dive into dataset collection and management for training!

📜 Paper Link: https://arxiv.org/pdf/2501.16992
🔗 Github Page: https://aioz-ai.github.io/FedEFM/
⚙️ Source Code: https://github.com/aioz-ai/FedEFM
📚 Blog Part 1: https://blog.ai.aioz.io/research/FedEFM1/

26/04/2025

📖 CathAction: A Benchmark for Endovascular Intervention Understanding (TMI 2024) 📘

[Journal website: https://ieeetmi.org/]

🔬 Part 3: Endovascular interventions require precision and adaptability, but existing datasets are limited in scope and scalability.

🚀 CathAction revolutionizes the field by offering the most extensive dataset for endovascular intervention research, covering five critical tasks: anticipation, recognition, segmentation, collision detection, and domain adaptation.
Key Insights from CathAction benchmarking:
🔹 ⏳ Anticipation: Transformer-based models like AFFT lead with 37.91% accuracy but face difficulties in occluded or fast-action scenarios.
🔹 🛠️ Recognition: TDN-ResNet101 achieves 62.5% accuracy, struggling with the subtlety of visually similar catheter actions.
🔹 🎨 Segmentation: SegViT demonstrates the power of transformer backbones for accurate catheter and guidewire segmentation.
🔹 ⚠️ Collision Detection: Tiny detectors like EFF perform better (mAP: 14.88%), but class imbalance and object size hinder overall accuracy.
🔹 🌐 Domain Adaptation: Significant performance drops from phantom to real animal data highlight the need for robust cross-domain solutions.
🚀 CathAction is a challenging dataset, and we welcome research contributions that tackle these tasks and improve endovascular intervention. The video below illustrates the challenge of understanding expert actions during endovascular surgery.

🔗 Link: https://arxiv.org/pdf/2408.13126
⚙️ Github Page: https://airvlab.github.io/cathaction/
📚 Blog Part 3: https://blog.ai.aioz.io/research/CathAction3/
