Wednesday, September 17, 2025

Linear equations in AI / machine learning

1. The Linear Equation

In machine learning, the model often starts with a linear equation:

y = w_1x_1 + w_2x_2 + \dots + b

Inputs = features (e.g., number of rooms in a house, area in sq. ft, etc.)

Weights = importance given to each feature

Bias = baseline adjustment

Output = prediction (e.g., house price)
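The equation above can be sketched directly in code. This is a minimal illustration, not a real model; the feature values, weights, and bias below are made-up numbers chosen only to show the mechanics.

```python
# Sketch of a linear model prediction: y = w_1*x_1 + w_2*x_2 + ... + b
def predict(features, weights, bias):
    """Weighted sum of features plus a bias term."""
    return sum(w * x for w, x in zip(weights, features)) + bias

# Example: two features (area in sq. ft, number of rooms) -- illustrative values.
features = [1000.0, 2.0]
weights = [0.5, 10.0]   # importance assigned to each feature (made up)
bias = 5.0              # baseline adjustment (made up)
y = predict(features, weights, bias)  # 0.5*1000 + 10*2 + 5 = 525.0
```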

---

2. How Weights Are Learned

Initially, weights are set randomly (like guessing).

The model makes a prediction.

It compares prediction vs. actual answer (this difference = error/loss).

Using an algorithm like gradient descent, the model adjusts weights step by step to reduce error.
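The loop above (guess, predict, measure error, adjust) can be sketched for a one-feature model. This is a toy example with made-up data following y = 2x; the learning rate and iteration count are arbitrary illustrative choices.

```python
# Minimal gradient-descent sketch for y = w*x + b on toy data (true relation: y = 2x).
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

w, b = 0.0, 0.0   # initial guesses for the weight and bias
lr = 0.05         # learning rate: how big each adjustment step is

for _ in range(2000):
    # Gradients of the mean squared error with respect to w and b
    grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
    grad_b = sum(2 * (w * x + b - y) for x, y in data) / len(data)
    # Step opposite the gradient to reduce the error
    w -= lr * grad_w
    b -= lr * grad_b

# After training, w is close to 2 and b is close to 0.
```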

---

3. Simple Example: Predicting House Price

Equation:

Price = (w_1 \times \text{Area}) + (w_2 \times \text{Bedrooms}) + b

Suppose training data says:

A 1000 sq. ft, 2-bedroom house = $300k

A 2000 sq. ft, 3-bedroom house = $500k


The model might learn weights like:

w_1 = 150 (each sq. ft adds $150)

w_2 = 20,000 (each bedroom adds $20,000)

b = 0 (no baseline adjustment)

So:

Price = 150 \times \text{Area} + 20{,}000 \times \text{Bedrooms}
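Plugging those learned weights into code makes the equation concrete. The 1500 sq. ft, 3-bedroom house below is a made-up query, not part of the training data.

```python
# House-price equation with the example weights from above.
def price(area, bedrooms, w1=150.0, w2=20_000.0, b=0.0):
    return w1 * area + w2 * bedrooms + b

# A hypothetical new house: 1500 sq. ft, 3 bedrooms.
estimate = price(1500, 3)  # 150*1500 + 20000*3 = 285000.0
```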

---

4. Intuition

If w_1 is large → Area matters a lot.

If w_2 is small → Bedrooms don’t influence much.

AI keeps tweaking weights until the predictions match reality closely.

---

👉 In short:

Weights = knobs AI turns to “tune” importance of inputs.

Training = the process of finding the best knob settings.

---

Deep learning YouTube series

https://youtube.com/playlist?list=PLehuLRPyt1HxuYpdlW4KevYJVOSDG3DEz&si=-5j3MRmA5BfKqem6

Friday, August 08, 2025

AI Agents Memory

An AI Agent’s Memory is the most important piece of Context Engineering. Here is how we define it:

In general, an agent’s memory is information we provide as context in the prompt passed to the LLM, helping the agent plan and act based on past interactions or on data that is not immediately available.

It is useful to group the memory into four types:

1. Episodic - This type of memory contains past interactions and actions performed by the agent. After an action is taken, the application controlling the agent stores the action in some kind of persistent storage so that it can be retrieved later if needed. A good example is using a vector database to store the semantic meaning of the interactions.
2. Semantic - Any external information that is available to the agent, plus any knowledge the agent should have about itself. You can think of this as context similar to that used in RAG applications. It can be internal knowledge available only to the agent, or a grounding context that isolates part of internet-scale data for more accurate answers.
3. Procedural - Systemic information such as the structure of the system prompt, available tools, guardrails, etc. It is usually stored in Git or in prompt and tool registries.
4. Occasionally, the agent application pulls information from long-term memory and stores it locally if it is needed for the task at hand.
5. All of the information pulled from long-term memory or stored locally is called short-term or working memory. Compiling it produces the prompt passed to the LLM, which then decides the further actions to be taken by the system.

We usually label 1. - 3. as Long-Term memory and 5. as Short-Term memory.
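The assembly of long-term memory into a short-term (working) prompt, as described in points 4 and 5, can be sketched as follows. Every function name, section label, and data store here is a hypothetical illustration, not a real library or agreed-upon format.

```python
# Hypothetical sketch: compiling long-term memory pieces into working memory.
def build_prompt(user_query, procedural, semantic, episodic):
    # procedural: system prompt / tool definitions (long-term, e.g. from a registry)
    # semantic:   retrieved knowledge, RAG-style context (long-term)
    # episodic:   relevant past interactions (long-term, e.g. from a vector DB)
    sections = [
        "SYSTEM:\n" + procedural,
        "CONTEXT:\n" + "\n".join(semantic),
        "PAST INTERACTIONS:\n" + "\n".join(episodic),
        "USER:\n" + user_query,
    ]
    # Everything pulled together here is the short-term / working memory,
    # which becomes the prompt passed to the LLM.
    return "\n\n".join(sections)

prompt = build_prompt(
    "What did I order last week?",
    procedural="You are a helpful shopping assistant.",
    semantic=["Store hours: 9-17."],
    episodic=["2025-08-01: user ordered a blue mug."],
)
```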

#LLM #AI #ContextEngineering

If we already have automation, what's the need for Agents?

“Automation” and “agent” sound similar, but they solve very different classes of problems. Automation = Fixed Instruction → Fixed Outcome ...