Mega Hub ICT Solutions.

Mega Hub ICT Solutions is a computer company at Nnamdi Azikiwe University, Ifite Awka.

22/09/2024

OpenAI with React
Enjoy this piece✨

19/06/2024

LoRA Adapters are, to me, one of the smartest strategies used in Machine Learning in recent years! It is one of those things where I think, "Wait! How didn't we think about that before?"

LoRA adapters are a very natural strategy for fine-tuning models. The starting point is that any parameter matrix in a trained neural network is just the sum of its initial values and the gradient descent updates accumulated over the training mini-batches:

𝜃(trained model) = 𝜃(initial value) + gradient descent updates

From there, we can understand a fine-tuned model as a set of model parameters where we continued to aggregate the gradients further on some specialized dataset:

𝜃(fine-tuned) = 𝜃(trained model) + more gradient descent updates
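
The two sums above can be checked on a toy problem. This is a minimal sketch, not anything from the original post: the one-parameter model y = w * x, the helper names `grad` and `train`, the learning rate and the tiny datasets are all assumptions made up for illustration.

```python
# Toy model (assumed for illustration): fit y = w * x with plain gradient descent.
def grad(w, batch):
    # Gradient of the squared error 0.5 * (w*x - y)**2, averaged over the mini-batch.
    return sum((w * x - y) * x for x, y in batch) / len(batch)

def train(w_start, batches, lr=0.1):
    # Run one gradient step per mini-batch and record every update that was applied.
    w = w_start
    updates = []
    for batch in batches:
        step = -lr * grad(w, batch)
        updates.append(step)
        w += step
    return w, updates

# "Pretraining" data follows y = 2x; the "specialized" data follows y = 3x.
pretrain_batches = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0)], [(0.5, 1.0), (1.5, 3.0)]]
finetune_batches = [[(1.0, 3.0)], [(2.0, 6.0)]]

w_init = 0.0
w_trained, pretrain_updates = train(w_init, pretrain_batches)
w_finetuned, finetune_updates = train(w_trained, finetune_batches)

# theta(trained) = theta(initial) + updates
print(w_trained, w_init + sum(pretrain_updates))
# theta(fine-tuned) = theta(trained) + more updates
print(w_finetuned, w_trained + sum(finetune_updates))
```

Each pair of printed values should agree up to floating-point rounding, which is all the decomposition claims.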

Writing ΔW for the fine-tuning updates accumulated on a weight matrix, we can decompose the pretraining learning and the fine-tuning learning into 2 terms:

𝜃(fine-tuned) = 𝜃(trained model) + ΔW

Once we see this, we understand that the 2 terms don't need to live in the same matrix; we can keep 2 different matrices and sum their outputs instead, because (W + ΔW)x = Wx + ΔWx for a trained weight matrix W and any input x. That is the idea behind LoRA: we allocate new weight parameters that will specialize in learning the fine-tuning data, and we freeze the original weights.
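
To see that summing the outputs of 2 matrices gives the same result as summing the matrices first, here is a small sketch in plain Python. The helper names `matvec` and `add_mat` and the numbers in `W`, `delta_W` and `x` are made up for illustration.

```python
def matvec(M, x):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def add_mat(P, Q):
    # Element-wise sum of two matrices of the same shape.
    return [[p + q for p, q in zip(p_row, q_row)] for p_row, q_row in zip(P, Q)]

W = [[1.0, 2.0], [3.0, 4.0]]          # frozen pretrained weights (illustrative values)
delta_W = [[0.5, -1.0], [0.0, 0.25]]  # separate fine-tuning weights (illustrative values)
x = [2.0, -1.0]

# Option 1: fold the update into the same matrix, then apply it.
merged = matvec(add_mat(W, delta_W), x)

# Option 2: keep 2 matrices and sum their outputs.
split = [a + b for a, b in zip(matvec(W, x), matvec(delta_W, x))]

print(merged, split)
```

Both options compute the same vector, so `W` can stay frozen while only `delta_W` is trained.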

On its own, this is not very useful, because a new full-size matrix of model parameters would just double the memory allocated to the model. So, the trick is to use a low-rank matrix approximation to reduce the number of operations and the required memory. We introduce 2 new matrices A and B, whose shared inner dimension (the rank) is much smaller than the dimensions of ΔW, to approximate ΔW:

ΔW ≈ BA
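
A rough sketch of what the low-rank form buys us, again in plain Python. The shapes are an assumption for illustration: ΔW is d x k, B is d x r and A is r x k, with a small rank r. The helper names `param_counts` and `lora_forward` and all the numbers are hypothetical, not taken from any library.

```python
def matvec(M, x):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def param_counts(d, k, r):
    # Parameters in a full d x k update versus the pair B (d x r) and A (r x k).
    return d * k, r * (d + k)

def lora_forward(W, A, B, x):
    # Frozen path Wx plus low-rank path B(Ax); the d x k product BA is never built.
    base = matvec(W, x)
    low_rank = matvec(B, matvec(A, x))
    return [b + l for b, l in zip(base, low_rank)]

# Illustrative sizes: a 4096 x 4096 layer with rank 8.
full, low = param_counts(4096, 4096, 8)
print(full, low, full // low)

# Tiny forward pass with d = 3, k = 2, r = 1 (made-up values).
W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # frozen, 3 x 2
A = [[1.0, 1.0]]                          # trainable, 1 x 2
B = [[0.5], [0.0], [-1.0]]                # trainable, 3 x 1
x = [1.0, 2.0]
out = lora_forward(W, A, B, x)
print(out)
```

With these assumed sizes the full update would need 16,777,216 parameters while B and A together need 65,536, which is 256 times fewer.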

It is important to realize that, typically, the amount of training data used for fine-tuning is much smaller than the data used for pretraining. As a consequence, it is unlikely that we could even have enough data to get good statistical convergence on the full matrix ΔW. The low-rank approximation acts as a regularization technique that will help the model generalize better on unseen data.

Divine The Developer of IoT 💻✍🏾.

16/06/2024

The CEO of this great tech initiative is now on LinkedIn. Please follow, and share the link so your contacts and connections can follow too. God bless you as you do.

12/11/2023

Happy Sunday from all of us at Mega Hub ICT Solutions.

30/10/2023

Mega Hub ICT Solutions is back. We are definitely going live tomorrow, no doubt.

22/10/2023

We are so sorry for not doing our live video as promised.
We will definitely meet up tomorrow morning.

19/10/2023

Synthetic Transaction Analysis of a Mobile App
New Video Dropping Tomorrow
Live video
ANTICIPATE.

18/10/2023

Address

Nnamdi Azikiwe University
Ifite Awka
