Activation and Norms

Activation / Non-linear function: A non-linear function is one that cannot be expressed in the form y = mx + c. Suppose we have a function y = x · W1 · W2 · … · Wl. Linearity with respect to x: hold all the weights constant and vary only x, and y is linear in x. Linearity with respect to the weights: hold x constant and vary the weights, and y is a product of the Wi, so it is non-linear in the weights taken together. ...
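To make this concrete, here is a minimal NumPy sketch (the layer sizes are arbitrary, chosen just for illustration): two stacked linear layers with no activation collapse to a single linear map, and inserting a ReLU breaks that equivalence.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))       # a small batch of inputs (shapes are arbitrary)
W1 = rng.normal(size=(8, 16))
W2 = rng.normal(size=(16, 2))

# Two linear layers with no activation collapse to the single linear map W1 @ W2.
stacked = x @ W1 @ W2
collapsed = x @ (W1 @ W2)
print(np.allclose(stacked, collapsed))    # True

# Inserting a ReLU between the layers breaks that equivalence.
relu = lambda z: np.maximum(z, 0.0)
nonlinear = relu(x @ W1) @ W2
print(np.allclose(nonlinear, collapsed))  # False in general
```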

September 8, 2024 · 6 min · Mohit Dulani

Statistics for ML

A deep dive into all the statistics required to learn Stable Diffusion from scratch. Probability and Distributions. Probability: the chance of an event happening, calculated as the ratio of the number of outcomes favouring that event to the total number of outcomes. Likelihood: a measure of how well a statistical model fits the observed data. Q) What is a statistical model, and what is observed data? ...
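A minimal sketch of the distinction, with made-up numbers: probability counts outcomes under a fixed model, while likelihood scores candidate model parameters against fixed observed data.

```python
from scipy import stats

# Probability: with a fair six-sided die, P(roll is even) = favourable / total outcomes.
p_even = 3 / 6
print(p_even)                          # 0.5

# Likelihood: how well do two candidate Gaussian models fit some observed data?
data = [4.8, 5.1, 5.3, 4.9, 5.0]       # made-up observations
for mu in (5.0, 7.0):                  # two candidate model parameters
    log_lik = stats.norm(loc=mu, scale=0.5).logpdf(data).sum()
    print(mu, log_lik)                 # mu=5.0 fits the data far better than mu=7.0
```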

September 8, 2024 · 13 min · Mohit Dulani

Learning a bit advanced Pytorch

Training Loop demystified:

- Forward Pass: compute predictions (the code where you pass the input to the model to get the output).
- Loss Calculation: compute the loss (the code where you calculate the loss between the predicted output and the target output).
- Backward Pass: compute the gradients for all parameters that have requires_grad = True (the code where you calculate the gradients of the loss with respect to the model parameters). loss.backward() does not update the parameters; it only computes the gradients and stores them in each parameter's .grad attribute.
- Parameter Update aka optimization: update the parameters of the model that have a .grad attribute / requires_grad set (the code where you update the parameters using the gradients and a learning rate).
- zero_grad(): the default behaviour of .grad is to accumulate gradients across backward passes (.grad is a tensor that sums the gradients for each parameter), so we need to set it to zero before the next backward pass.

Training Loop in Pytorch:

```python
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        optimizer.zero_grad()
        # Forward pass
        outputs = model(images)
        loss = criterion(outputs, labels)
        # Backward pass (computes the gradients and stores them in each parameter's .grad)
        loss.backward()
        # Updates the parameters based on the defined optimizer and learning rate
        optimizer.step()
```

The alias codes and removing the abstraction: loss.backward() translates to ...
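To see the accumulation behaviour that makes zero_grad() necessary, here is a minimal sketch with a toy one-parameter model (not from the post): calling backward() twice without zeroing doubles the stored gradient.

```python
import torch

w = torch.tensor(2.0, requires_grad=True)  # a toy parameter
x = torch.tensor(3.0)

loss = w * x                               # toy "loss"; d(loss)/dw = x = 3
loss.backward()
print(w.grad)                              # tensor(3.)

loss = w * x
loss.backward()                            # gradients accumulate into .grad
print(w.grad)                              # tensor(6.), not 3 -- hence zero_grad()

w.grad.zero_()                             # what optimizer.zero_grad() does per parameter
print(w.grad)                              # tensor(0.)
```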

September 7, 2024 · 7 min · Mohit Dulani

Learning Docker

Docker Commands end to end:

- docker run - run a container from an image
- docker start - start a container that has been stopped
- docker stop - stop a running container
- docker kill - kill a running container
- docker rm - remove a container

Dockerfile: CMD ["/app/start.sh"] runs as soon as you run the container (not at build time). It only takes effect at run time and is easily overridden by something like docker run -it <image_name> /bin/bash, after which /bin/bash runs as the entry point! ...

September 7, 2024 · 2 min · Mohit Dulani

Generative AI explained

Machine learning aka mathematical modelling. We start with an equation … a tough equation … and then we ask the model to fit the data and find the correct parameters for it. The more diverse the data, the better the model learns and the better it works in the actual environment. Let's take a deep dive into the word "Generative" in Generative Artificial Intelligence.

June 10, 2024 · 1 min · Mohit Dulani

Intuition / Thinking point for AI explained

KL-Divergence, flow-matching. Generative models are best-in-class approximators of complex probability distributions that, most of the time, we have no explicit form for! Image modality: all images in the world come from a very complex distribution over pixels, conditioned on prompts and on what the user wants. Since that distribution is far too complex to model directly, we rely on a neural network to approximate it, hence diffusion models. ...
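Since the teaser names KL-divergence, here is a minimal sketch of computing it between two discrete distributions (the numbers are made up): it measures how badly the approximation q matches the target p.

```python
import numpy as np

# KL(P || Q) = sum_i P(i) * log(P(i) / Q(i)) for discrete distributions.
p = np.array([0.7, 0.2, 0.1])   # "true" distribution (made-up)
q = np.array([0.5, 0.3, 0.2])   # a model's approximation (made-up)

kl_pq = np.sum(p * np.log(p / q))
kl_qp = np.sum(q * np.log(q / p))
print(kl_pq, kl_qp)             # KL is not symmetric: KL(P||Q) != KL(Q||P)
```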

June 10, 2024 · 3 min · Mohit Dulani

Diffusion and flow matching

The prerequisite for this blog post is Journey till diffusion model. Stable Diffusion model: uses cross-attention to allow conditional modelling (using text / a segmentation map + an image to generate an image). Mode collapse does not happen in likelihood-based models, and SD is a likelihood-based model. High-frequency details: the fine, detail-oriented parts of an image. Related work (previous work in this field): VQ-VAE and VQ-GAN. What is the inductive bias of DMs inherited by the U-Net model? ...
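Since cross-attention is what carries the conditioning, here is a minimal single-head sketch (dimensions are arbitrary and this is not the actual SD architecture): image tokens act as queries, text tokens supply the keys and values.

```python
import torch
import torch.nn.functional as F

d = 64                              # feature dimension (arbitrary)
img_tokens = torch.randn(1, 16, d)  # queries: flattened image/latent patches
txt_tokens = torch.randn(1, 8, d)   # keys/values: text-encoder outputs

Wq, Wk, Wv = (torch.nn.Linear(d, d) for _ in range(3))

q = Wq(img_tokens)                  # each image token asks: "what in the prompt matters here?"
k, v = Wk(txt_tokens), Wv(txt_tokens)

attn = F.softmax(q @ k.transpose(-2, -1) / d ** 0.5, dim=-1)  # (1, 16, 8)
out = attn @ v                      # each image token becomes a mix of text features
print(out.shape)                    # torch.Size([1, 16, 64])
```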

June 9, 2024 · 11 min · Mohit Dulani

Journey till diffusion model

Diffusion models were not built in a day, so we will learn the underlying concepts that led up to them. Autoencoders: the most basic type, used for reconstructing data; for example, given an image, you can train an autoencoder to reconstruct it. They can be used for dimensionality reduction, feature extraction and data compression. Variational Autoencoders: the first in this category to produce new data points from an existing dataset by learning the latent space of the data. The encoder produces two vectors, a mean and a log variance, and we sample a latent vector from a Gaussian distribution with those parameters. This latent vector is then given to the decoder to reconstruct an image of the same size as the original. ...
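The sampling step described above is usually implemented with the reparameterization trick so gradients can flow through it; here is a minimal sketch with toy dimensions (not the post's exact code):

```python
import torch

# Toy encoder outputs for one image: a mean and a log-variance vector (made-up sizes).
mu = torch.zeros(1, 32, requires_grad=True)       # latent mean
log_var = torch.zeros(1, 32, requires_grad=True)  # latent log-variance

# Reparameterization trick: z = mu + sigma * eps keeps the sampling differentiable.
eps = torch.randn_like(mu)                        # eps ~ N(0, I)
z = mu + torch.exp(0.5 * log_var) * eps           # a sample from N(mu, sigma^2)

# z would then be passed to the decoder to reconstruct the image.
print(z.shape)                                    # torch.Size([1, 32])
```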

March 3, 2024 · 3 min · Mohit Dulani

Learning NMAP

Learning Nmap basics. Ping scan: pings all devices on a subnet => nmap -sP 192.168.1.0/24, which sends every device an ICMP echo request (Internet Control Message Protocol echo, a request that asks "Are you there?"); tools like ping send these requests. Single-host port scan: examine a single host for the 1000 most common ports (80: http, 22: ssh) … nmap 192.168.1.100 // nmap scanme.nmap.org ...

February 29, 2024 · 2 min · Mohit Dulani

Sniffing packets and testing those

Wireless network adapter, aka the wifi card used to connect to wifi networks.

- WEP: the oldest.
- WPA: any captured certificate can lead to a leak.
- WPA2: kick a user off; when he reconnects, capture the 4-way handshake and the certificate exchanged in between.
- WPA3: all password attempts need to be made against the live network.

Monitor mode: in this mode the adapter becomes a radio sniffer, listening to all wireless signals in the air on a specific channel. It hears everything happening in the room. ...

June 21, 2021 · 4 min · Mohit Dulani