LoRA vs Fine-Tuned AI Models. My user choices.

Aug 15, 2025

·

9 min read

In my previous article, AI and Creativity, I explored the potential of AI to handle tasks that require creativity, originality, and diversity. The main conclusion of that research was that a base AI model, no matter how advanced, is inherently incapable of true creativity. To truly emulate creative expression, AI requires fine-tuning that captures the unique perspective of an individual (author), as creativity is deeply tied to personal experience and worldview.

Creativity doesn’t have structured rules, documentation, or widely accepted best practices. There’s no concept of normality or a single right answer in art.

Creative work reflects the author’s inner worldview. This worldview is a perception, created and shaped by a cocktail of the numerous events that have occurred in their life. A work of art is an imprint of that perception at the moment the work was created.

A large foundation AI model, by its nature, is essentially an average product of all the data it was trained on. As a result, both creativity and diversity are flattened into a single mean line. This is something everyone can notice, and many people regularly raise the problem of the lack of originality in AI-generated content on social media.

Of course, if you are not a creative person yourself, base AI models like ChatGPT or Grok might seem highly inventive, because your own creativity sits below that average line.
However, if you are a creative person, the outputs of a base AI model will hinder your ideas rather than inspire them.

Demand for custom AI models for individual users is growing rapidly. Some notable startups in this field include Delphi, Character AI, Replika, and Personal AI. Soon, everyone will have multiple custom AI models tailored to their specific needs, from creative writing to professional tasks. For in-depth insights, I recommend the following resources:

At Stroke, we are building a platform that allows anyone to train and collaborate with their own AI twin for any task, creative or not. With such an AI twin, users would recognize themselves in the outputs. For creators, this will dramatically accelerate their work, much like AI tools have sped up software development many times over.

Today, I aim to explore the most efficient technical way to create AI twins on Stroke. To answer this, I will outline my own preferences as an ordinary user (I write a lot), and then analyze, from the perspective of Stroke’s CTO, which solution best fits those preferences.

My user preferences

As a writer who creates extensively, I want AI twins that feel like an extension of my creative self. Here’s what I need:

  1. Cost of training. Training multiple models should be affordable and cost-efficient for me. I should be able to experiment with models freely without high costs.
  2. Ease to start. I may not have large datasets of my content. The training process should work even if I provide a small amount of data. A few articles, notes, or drafts — the system should be smart enough to adapt from there.
  3. Training speed. Waiting days or many hours would be a disaster for me. My AI twin should be ready in an hour or minutes, so I can start using it without delays.
  4. Ownership. I want to own my AI twin, both technically and legally. I don’t want it to be public or used by anyone without my direct consent. I also want the option to store it under my own control.
  5. Size. Since I want to store my AI twins locally, model size matters.
  6. Cross-platform. It must work seamlessly across devices, including my phone. Creative ideas often come on the go, and it would be great to discuss and record them with my AI on the fly.
  7. Inference speed. Working with the model should feel smooth and responsive, even when running locally.
  8. Continuous learning. The model should adapt to my changing style and worldview through continuous learning.

Overall, as an ordinary user, I just want a cheap, fast, secure, lightweight, and user-friendly solution.

Efficient ways to create custom AI models

Currently, there are several approaches to creating custom AI models:

  • Fine-tuning. Fully or partially retraining the base model, changing all or most of its parameters.
  • Adapter-based methods. Injecting small additional layers into the base model without changing its core weights. The most popular method is LoRA (Low-Rank Adaptation).
  • Prompt-based engineering with ICL (In-Context Learning). Using carefully designed prompts and contextual examples to guide the model within the model’s context window.

Before comparing these approaches against my user preferences, it’s important to identify which methods provide enough customization for the model to feel like my true AI twin. It should be capable of mirroring my creative style and reasoning logic in a way that feels concise and natural. This is my core goal as a user, a prerequisite that comes before any of my personal preferences.

With full or partial fine-tuning, we adjust the weights of a base model on a new, extensive dataset. As a result, all or most of its weights are updated. These changes enable deep adaptation and can completely transform the AI model.

This fine-tuning technique is usually applied when the domain or end goal of the model needs to change significantly, for instance, when we want to create a medical research model, a legal services model, and so on.

Another case is when we want to modify the AI model’s reasoning logic. For example, we might want a scientific assistant that strictly follows a verification procedure to prove any hypothesis or statement before relying on it.

This fine-tuning process always requires a large amount of domain-specific data, along with careful mixing of that data during retraining so the base model’s general capabilities are not degraded. But from a personalization standpoint, fine-tuning offers great potential to create a true personal AI.
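For reference, here is a minimal sketch of what full fine-tuning looks like with the Hugging Face Transformers stack. The model name, dataset file, and hyperparameters are illustrative assumptions, not Stroke’s actual pipeline.

```python
# Minimal full fine-tuning sketch with Hugging Face Transformers.
# Model name, dataset file, and hyperparameters are illustrative assumptions.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

base_model = "mistralai/Mistral-7B-v0.1"          # assumed base model
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base_model)  # all weights are trainable

# Assumed dataset: plain-text samples from the target domain (e.g., my articles).
dataset = load_dataset("text", data_files={"train": "my_articles.txt"})["train"]
dataset = dataset.map(lambda x: tokenizer(x["text"], truncation=True, max_length=512),
                      remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="full-finetune",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=16,
        num_train_epochs=1,
        learning_rate=1e-5,       # small learning rate: every weight is being updated
        bf16=True,
    ),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()   # updates all ~7B parameters; needs multi-GPU-scale memory and time
```

Even this toy setup makes the cost profile obvious: every parameter is touched, so memory, compute, and storage all scale with the full model size.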

Apart from fine-tuning, there are other approaches to adapting the model, such as LoRA. The LoRA method involves adding small additional layers to the base model and training only the parameters of these new layers.

The concept of LoRA was inspired by observations that weight updates during fine-tuning have a low intrinsic dimension, as highlighted in the work of Aghajanyan et al. (2020). The name LoRA was introduced by a team of Microsoft researchers in the paper LoRA: Low-Rank Adaptation of Large Language Models.

This approach enables faster training because only a small fraction of the weights (usually 1–5%, often less) is trained. Moreover, because the base model’s own weights stay frozen, it preserves most of its original knowledge and structure.
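To make this concrete, here is a minimal sketch of attaching LoRA adapters with the Hugging Face PEFT library. Conceptually, each frozen weight matrix W is augmented with a low-rank product BA, and only the small A and B matrices are trained. The base model name, rank, and target modules below are illustrative assumptions.

```python
# Minimal LoRA sketch with the Hugging Face PEFT library.
# Base model name, rank, and target modules are illustrative assumptions.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base_model = "mistralai/Mistral-7B-v0.1"          # assumed base model
model = AutoModelForCausalLM.from_pretrained(base_model)

lora_config = LoraConfig(
    r=16,                                  # rank of the low-rank matrices A and B
    lora_alpha=32,                         # scaling factor for the adapter update
    target_modules=["q_proj", "v_proj"],   # attention projections, a common choice
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
# Prints something like: trainable params in the low millions vs. ~7B total,
# i.e. well under 1% of the model. The frozen weights W stay untouched; only
# A and B train, and the saved adapter is megabytes, not tens of gigabytes.
```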

For more technical details about LoRA, I recommend these resources:

The question is: how much personalization can we achieve with LoRA adapters? Can such an AI truly be called an AI twin of a human? The paper LoRA: Low-Rank Adaptation of Large Language Models (Hu et al., 2021) shows that LoRA delivers personalization performance comparable to full fine-tuning. For example, on the GLUE benchmark, LoRA reached an average score of approximately 87.8%, while the fully fine-tuned model achieved 88.9%, with LoRA training less than 1% of the parameters.

This demonstrates that LoRA can effectively personalize AI models with minimal cost.

The third method is prompt-based tuning and ICL. This approach relies on carefully designed prompts, allowing the model to temporarily adapt to examples or instructions within the session and its context window. It is certainly useful for short-term guidance and for everyday chatting with AI, but it is hard to consider prompt-based tuning a true AI twin.

One of the most obvious disadvantages is the maximum context window size. When the limit is reached, the AI starts truncating incoming data, losing track of the conversation and of my identity.

During a long conversation with an AI, the context window can fill up rather quickly, since the prompt includes the entire conversation history every time (a rough illustration is sketched below).
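The following is a self-contained illustration of the problem, assuming a rough 4-characters-per-token estimate and an arbitrary 8K-token context limit; real tokenizers and limits differ.

```python
# Illustrative sketch: why in-context "personalization" hits the context window.
# The token estimate and limit below are assumptions for illustration only.
CONTEXT_LIMIT = 8_192          # assumed model context window, in tokens

history = []                   # the full conversation is re-sent on every turn


def tokens(text: str) -> int:
    # Rough approximation: ~4 characters per token for English text.
    return max(1, len(text) // 4)


def chat_turn(user_message: str, assistant_reply: str) -> int:
    """Append one exchange and return the prompt size for the next turn."""
    history.append(("user", user_message))
    history.append(("assistant", assistant_reply))
    return sum(tokens(msg) for _, msg in history)


# Simulate a long writing session: the prompt grows with every exchange.
for turn in range(1, 1000):
    prompt_size = chat_turn("Here is my next draft paragraph... " * 10,
                            "Here is my feedback on your draft... " * 10)
    if prompt_size > CONTEXT_LIMIT:
        print(f"Context window exceeded after {turn} turns; "
              "older messages (and my 'identity') start getting truncated.")
        break
```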

Thus, the described method cannot replicate the deep alignment with my worldview and creative style required for my AI twin.

Comparison and my user choice

As a result, both fine-tuning and LoRA adapters are viable options for creating an AI twin. Let’s compare these two methods against my user preferences.

| Aspect | Fine-Tuning | LoRA |
| --- | --- | --- |
| Cost of training | 🟥 $500–$5,000 (multi-GPU) | 🟩 $10–$100 (single GPU) |
| Ease to start | 🟥 Requires extensive datasets | 🟩 Works with small datasets |
| Training speed | 🟥 Days to weeks (e.g., 7B model) | 🟩 Minutes to hours (e.g., 7B model) |
| Ownership | 🟩 Full control | 🟩 Full control |
| Size | 🟥 Large (e.g., 10–300 GB) | 🟩 Small adapters (e.g., 10–100 MB) |
| Cross-platform | 🟨 Deployable but resource-heavy | 🟩 Lightweight adapters |
| Inference speed | 🟨 Fast, but resource-intensive | 🟩 Fast, not resource-intensive |
| Continuous learning | 🟨 Possible but slow/costly to retrain fully | 🟩 Efficient for frequent updates |

As shown in the table, LoRA outperforms full fine-tuning on most of my user preferences. Since both fine-tuning and LoRA adapters meet the core need of creating an AI twin, LoRA is preferable for me due to its efficiency.

I think it would be ideal if Stroke provided a library of pre-fine-tuned models designed for specific creative tasks: writing books, creating blog content, doing scientific research, and so on. I could then choose a pre-fine-tuned model and adapt it to my unique perception and style using LoRA. This workflow would meet all my needs and make me a happy user of Stroke.
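As a sketch of how that workflow could look with off-the-shelf tooling: a shared pre-fine-tuned base model plus my small personal LoRA adapter loaded on top at inference time. The model and adapter names below are hypothetical, and the actual Stroke implementation may differ.

```python
# Sketch of the "pre-fine-tuned base + personal LoRA adapter" workflow.
# The base model and adapter paths are hypothetical placeholders.
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "stroke/creative-writing-7b"        # hypothetical pre-fine-tuned base model
adapter = "./my-ai-twin-lora"              # my personal adapter, stored locally

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)
model = PeftModel.from_pretrained(model, adapter)   # attach my LoRA weights

prompt = "Draft an opening paragraph in my usual essay style about creativity."
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Because the adapter is only megabytes in size, swapping between several personal or task-specific twins on the same base model stays cheap and fast.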