Regression Analysis in Python

Regression analysis is one of the most widely used statistical techniques in data science and machine learning. It helps us quantify relationships between variables and make predictions from them. In this post, we’ll walk through the basics of regression analysis using Python and its popular libraries.


🔍 What is Regression Analysis?

Regression analysis is a predictive modeling technique used to examine the relationship between a dependent (target) variable and one or more independent (predictor) variables. The most common type is linear regression, which assumes the target changes linearly with each predictor.

For example, predicting house prices based on area, location, and number of bedrooms is a regression problem.
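To make "linear relationship" concrete: with a single predictor, linear regression fits a line of the form price ≈ intercept + slope × area. Here is a minimal sketch of that idea using only NumPy's least-squares solver. The numbers are made up for illustration, and the variable names (`area`, `price`, `A`, `beta`) are just choices for this example.

```python
import numpy as np

# Made-up illustrative numbers: area in square feet, price in currency units
area = np.array([1000, 1500, 2000, 2500, 3000], dtype=float)
price = np.array([200000, 250000, 300000, 350000, 400000], dtype=float)

# Design matrix with a column of ones so the intercept is estimated too
A = np.column_stack([np.ones_like(area), area])

# Ordinary least squares: find beta that minimises ||A @ beta - price||^2
beta, *_ = np.linalg.lstsq(A, price, rcond=None)
intercept, slope = beta

print(f"price ≈ {intercept:.0f} + {slope:.0f} * area")
```

Because these sample points lie exactly on the line price = 100000 + 100 × area, the fit recovers that slope and intercept. Real data is noisy, so the line would only approximate the points.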


🛠 Libraries Required

Before we start, install the required Python libraries:

pip install numpy pandas matplotlib seaborn scikit-learn

Now, let's import them:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score


📊 Sample Example: Predicting House Prices

Let's create a simple regression model using a sample dataset:

# Sample dataset
data = pd.DataFrame({
    'Area': [1000, 1500, 2000, 2500, 3000],
    'Price': [200000, 250000, 300000, 350000, 400000]
})

# Splitting input (features) and output (target)
X = data[['Area']]
y = data['Price']

# Splitting into train and test sets.
# test_size=0.4 holds out 2 of the 5 rows; R² is undefined on a single test sample.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=0)

# Training the model
model = LinearRegression()
model.fit(X_train, y_train)

# Predicting on the held-out rows
y_pred = model.predict(X_test)

# Evaluating
print("Mean Squared Error:", mean_squared_error(y_test, y_pred))
print("R² Score:", r2_score(y_test, y_pred))
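If you're wondering what those two metrics actually measure, here is a small sketch that computes them by hand with NumPy. The `y_true` and `y_hat` arrays are hypothetical values chosen only to keep the arithmetic easy to follow; they are not output from the house price model.

```python
import numpy as np

# Hypothetical true values and predictions, for illustration only
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_hat = np.array([2.5, 5.5, 7.0, 9.5])

residuals = y_true - y_hat

# Mean squared error: the average of the squared residuals
mse = np.mean(residuals ** 2)

# R² = 1 - SS_res / SS_tot
ss_res = np.sum(residuals ** 2)                  # variation the model fails to explain
ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total variation around the mean
r2 = 1 - ss_res / ss_tot

print("MSE:", mse)
print("R²:", r2)
```

A lower MSE means predictions sit closer to the true values, and an R² near 1 means the model explains most of the variation in the target. Note that `ss_tot` is zero when there is only one sample, which is why R² needs at least two test points to be meaningful.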


📈 Visualizing the Regression Line

plt.scatter(X, y, color='blue')
plt.plot(X, model.predict(X), color='red')
plt.xlabel('Area')
plt.ylabel('Price')
plt.title('Linear Regression')
plt.show()


✅ Conclusion

Regression analysis in Python is straightforward and powerful. With just a few lines of code, you can build predictive models and gain insights into data. Whether you're analyzing business trends or forecasting future sales, regression is a must-have tool in any data scientist’s toolkit.

Start with simple linear regression, then gradually explore more advanced techniques: polynomial regression for curved relationships, and ridge and lasso regression, which add regularization to help reduce overfitting.
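As a starting point for that exploration, here is a rough sketch comparing plain linear regression with polynomial, ridge, and lasso variants in scikit-learn. The data is synthetic (a noisy quadratic curve generated just for this example), and the `alpha` values and polynomial degree are arbitrary assumptions, not tuned recommendations.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic, made-up data: a quadratic curve plus random noise
rng = np.random.default_rng(0)
X = np.linspace(-3, 3, 60).reshape(-1, 1)
y = 0.5 * X.ravel() ** 2 + X.ravel() + 2 + rng.normal(scale=0.3, size=60)

def poly_features():
    # Degree-2 terms (x, x^2), scaled so the penalties treat them evenly
    return [PolynomialFeatures(degree=2, include_bias=False), StandardScaler()]

models = {
    "linear": LinearRegression(),
    "polynomial": make_pipeline(*poly_features(), LinearRegression()),
    "ridge": make_pipeline(*poly_features(), Ridge(alpha=1.0)),
    "lasso": make_pipeline(*poly_features(), Lasso(alpha=0.1)),
}

# Fit each model and record R² on the same data it was trained on
scores = {}
for name, model in models.items():
    model.fit(X, y)
    scores[name] = model.score(X, y)
    print(f"{name}: R² = {scores[name]:.3f}")
```

On this curved data, the straight line should score noticeably lower than the polynomial variants. Keep in mind these are training scores; for a fair comparison on real data you would evaluate on a held-out test set or use cross-validation.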
