Chapter 2: Data Analysis Using Python in Google Colab

 


What is Google Colab?

Google Colaboratory (Colab) is a free, browser-based notebook service from Google where you can write and run Python code without installing any software.

Features

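  • Free Python environment with nothing to install
  • Google Drive integration for saving notebooks and data
  • Access to GPU/TPU runtimes (subject to availability)
  • Easy sharing of notebooks
  • Well suited to data analysis and machine learning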

Step 1: Open Google Colab

Open: Google Colab (https://colab.research.google.com)

Click:

  • New Notebook

Step 2: Import Required Libraries

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

Step 3: Create Sample Dataset

data = {
'Student': ['Amit', 'Rahul', 'Sneha', 'Pooja', 'Karan'],
'Marks': [85, 72, 90, 67, 78],
'City': ['Nagpur', 'Pune', 'Mumbai', 'Delhi', 'Nagpur']
}

df = pd.DataFrame(data)

print(df)

Output

   Student  Marks    City
0     Amit     85  Nagpur
1    Rahul     72    Pune
2    Sneha     90  Mumbai
3    Pooja     67   Delhi
4    Karan     78  Nagpur

Step 4: Basic Data Analysis

Display First Rows

print(df.head())

Dataset Information

df.info()  # info() prints its report directly and returns None, so no print() is needed

Statistical Summary

print(df.describe())
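
By default, describe() summarizes only the numeric columns, so Student and City are left out. The following is a minimal sketch, reusing the sample data from Step 3, of two ways to summarize the text columns as well. The variable names full_summary and city_counts are only illustrative.

```python
import pandas as pd

# Same sample data as in Step 3
df = pd.DataFrame({
    'Student': ['Amit', 'Rahul', 'Sneha', 'Pooja', 'Karan'],
    'Marks': [85, 72, 90, 67, 78],
    'City': ['Nagpur', 'Pune', 'Mumbai', 'Delhi', 'Nagpur'],
})

# include='all' adds count/unique/top/freq rows for the text columns
full_summary = df.describe(include='all')
print(full_summary)

# value_counts() shows how many students come from each city
city_counts = df['City'].value_counts()
print(city_counts)
```

In the combined summary, statistics that do not apply to a column (for example, the mean of City) appear as NaN.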

Step 5: Filter Data

Students with Marks Greater Than 75

high_marks = df[df['Marks'] > 75]

print(high_marks)

Output

  Student  Marks    City
0    Amit     85  Nagpur
2   Sneha     90  Mumbai
4   Karan     78  Nagpur
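
Filters can also be combined. As a rough sketch on the same sample data, the example below joins two conditions with & and matches several cities with isin(). The names nagpur_high and metro are made up for illustration.

```python
import pandas as pd

# Same sample data as in Step 3
df = pd.DataFrame({
    'Student': ['Amit', 'Rahul', 'Sneha', 'Pooja', 'Karan'],
    'Marks': [85, 72, 90, 67, 78],
    'City': ['Nagpur', 'Pune', 'Mumbai', 'Delhi', 'Nagpur'],
})

# Combine conditions with & (and) or | (or);
# each condition needs its own parentheses
nagpur_high = df[(df['Marks'] > 75) & (df['City'] == 'Nagpur')]
print(nagpur_high)

# isin() keeps rows whose value matches any item in the list
metro = df[df['City'].isin(['Mumbai', 'Delhi'])]
print(metro)
```

Note that Python's plain `and` / `or` keywords do not work on pandas columns; use & and | instead.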

Step 6: Calculate Average Marks

average = df['Marks'].mean()

print("Average Marks:", average)
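
Once the average is known, it can be used to flag each student. The sketch below, on the same sample data, adds a column named AboveAverage (a name chosen here only for illustration).

```python
import pandas as pd

# Same sample data as in Step 3
df = pd.DataFrame({
    'Student': ['Amit', 'Rahul', 'Sneha', 'Pooja', 'Karan'],
    'Marks': [85, 72, 90, 67, 78],
    'City': ['Nagpur', 'Pune', 'Mumbai', 'Delhi', 'Nagpur'],
})

average = df['Marks'].mean()  # (85 + 72 + 90 + 67 + 78) / 5

# True where a student's marks are above the class average
df['AboveAverage'] = df['Marks'] > average

print("Average Marks:", average)
print(df[['Student', 'Marks', 'AboveAverage']])
```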

Step 7: Group By Example

City-wise Average Marks

city_avg = df.groupby('City')['Marks'].mean()

print(city_avg)
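
A group can be summarized with more than one statistic at a time. The following sketch, again on the sample data, uses agg() to get the mean, the maximum, and the number of students per city. The name city_stats is illustrative.

```python
import pandas as pd

# Same sample data as in Step 3
df = pd.DataFrame({
    'Student': ['Amit', 'Rahul', 'Sneha', 'Pooja', 'Karan'],
    'Marks': [85, 72, 90, 67, 78],
    'City': ['Nagpur', 'Pune', 'Mumbai', 'Delhi', 'Nagpur'],
})

# One row per city, one column per statistic
city_stats = df.groupby('City')['Marks'].agg(['mean', 'max', 'count'])
print(city_stats)
```

Nagpur is the only city with two students, so it is the only row where the mean differs from the maximum.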

Step 8: Data Visualization

Bar Chart

plt.bar(df['Student'], df['Marks'])

plt.title("Student Marks Analysis")
plt.xlabel("Students")
plt.ylabel("Marks")

plt.show()
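
A bar chart is often easier to read when the bars are sorted and labeled. The sketch below is one possible variation on the chart above, not the only way to do it: it sorts the students by marks and writes each value above its bar. The +1 vertical offset for the labels is an arbitrary choice.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Same sample data as in Step 3
df = pd.DataFrame({
    'Student': ['Amit', 'Rahul', 'Sneha', 'Pooja', 'Karan'],
    'Marks': [85, 72, 90, 67, 78],
    'City': ['Nagpur', 'Pune', 'Mumbai', 'Delhi', 'Nagpur'],
})

# Sort so the highest marks come first
sorted_df = df.sort_values('Marks', ascending=False)

fig, ax = plt.subplots()
bars = ax.bar(sorted_df['Student'], sorted_df['Marks'])

# Write each student's marks just above the matching bar
for bar in bars:
    x_center = bar.get_x() + bar.get_width() / 2
    height = bar.get_height()
    ax.text(x_center, height + 1, str(int(height)), ha='center')

ax.set_title("Student Marks Analysis (sorted)")
ax.set_xlabel("Students")
ax.set_ylabel("Marks")

plt.show()
```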

Step 9: Upload Excel/CSV File in Colab

from google.colab import files

uploaded = files.upload()

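After upload, read the file using the name it was uploaded under (replace 'filename.csv' with your own file name):

df = pd.read_csv('filename.csv')  # for an Excel file, use pd.read_excel('filename.xlsx') instead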

print(df.head())
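
The value returned by files.upload() is a dictionary that maps each uploaded file name to its contents as bytes, so the file name does not have to be typed by hand. The sketch below shows one way to use that. load_uploaded_csv is a hypothetical helper written for this example, and the uploaded dictionary here is a stand-in so the code also runs outside Colab.

```python
import io
import pandas as pd

def load_uploaded_csv(uploaded):
    """Read the first uploaded file into a DataFrame.

    `uploaded` is assumed to have the same shape as the dict returned
    by files.upload(): file name -> file contents as bytes.
    """
    filename = next(iter(uploaded))  # name of the first uploaded file
    return pd.read_csv(io.BytesIO(uploaded[filename]))

# Stand-in for `uploaded = files.upload()` (illustrative data only)
uploaded = {'students.csv': b'Student,Marks\nAmit,85\nRahul,72\n'}

df = load_uploaded_csv(uploaded)
print(df.head())
```

In Colab, replace the stand-in dictionary with the real result of files.upload().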

Step 10: Real-World Project Ideas

Student Result Analysis

  • Highest Marks
  • Lowest Marks
  • Average Result
  • Subject-wise Analysis
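
As a starting point, the four items above could be sketched as follows. The subject columns and all marks are invented for illustration; a real project would load them from a file.

```python
import pandas as pd

# Hypothetical marks in three subjects (illustrative data only)
results = pd.DataFrame({
    'Student': ['Amit', 'Rahul', 'Sneha'],
    'Maths': [85, 72, 90],
    'Science': [78, 65, 88],
    'English': [80, 70, 92],
})
subjects = ['Maths', 'Science', 'English']

# Total marks per student across all subjects
results['Total'] = results[subjects].sum(axis=1)

# Highest and lowest marks: idxmax()/idxmin() give the row label
top_student = results.loc[results['Total'].idxmax(), 'Student']
bottom_student = results.loc[results['Total'].idxmin(), 'Student']

# Average result across the class
average_total = results['Total'].mean()

# Subject-wise analysis: average marks in each subject
subject_avg = results[subjects].mean()

print("Highest:", top_student)
print("Lowest:", bottom_student)
print("Average total:", average_total)
print(subject_avg)
```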

Sales Data Analysis

  • Monthly Sales
  • Profit Analysis
  • Product Performance
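
A sales project along these lines might begin like the sketch below. The column names (Date, Product, Revenue, Cost) and every figure are assumptions made up for the example.

```python
import pandas as pd

# Hypothetical sales records (illustrative data only)
sales = pd.DataFrame({
    'Date': pd.to_datetime(['2024-01-05', '2024-01-20',
                            '2024-02-03', '2024-02-18']),
    'Product': ['Pen', 'Notebook', 'Pen', 'Notebook'],
    'Revenue': [500, 1200, 700, 900],
    'Cost': [300, 800, 400, 650],
})

# Profit analysis: profit per sale
sales['Profit'] = sales['Revenue'] - sales['Cost']

# Monthly sales: group by calendar month and add up the revenue
monthly_sales = sales.groupby(sales['Date'].dt.to_period('M'))['Revenue'].sum()

# Product performance: total revenue and profit per product
product_perf = sales.groupby('Product')[['Revenue', 'Profit']].sum()

print(monthly_sales)
print(product_perf)
```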

Employee Analysis

  • Salary Analysis
  • Department-wise Report
  • Attendance Report
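
One possible shape for a department-wise report is sketched below. The employee names, salaries, attendance figures, and the 24 working days are all hypothetical, as are the output column names.

```python
import pandas as pd

# Hypothetical employee records (illustrative data only)
employees = pd.DataFrame({
    'Employee': ['Asha', 'Vikram', 'Neha', 'Rohan'],
    'Department': ['HR', 'IT', 'IT', 'HR'],
    'Salary': [40000, 60000, 55000, 45000],
    'DaysPresent': [22, 20, 24, 18],
})
working_days = 24  # assumed number of working days in the month

# Attendance report: percentage of working days each employee was present
employees['AttendancePct'] = employees['DaysPresent'] / working_days * 100

# Department-wise report using named aggregation:
# output_column=(input_column, function)
dept_report = employees.groupby('Department').agg(
    AvgSalary=('Salary', 'mean'),
    Headcount=('Employee', 'count'),
    AvgAttendancePct=('AttendancePct', 'mean'),
)
print(dept_report.round(2))
```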

Popular Python Libraries for Data Analysis

Library      Use
pandas       Data handling
numpy        Numerical operations
matplotlib   Charts
plotly       Interactive charts
sklearn      Machine learning

Simple Interview Questions

What is Pandas?

Pandas is a Python library used for data analysis and data manipulation.

What is a DataFrame?

A DataFrame is the two-dimensional, table-like data structure in pandas, with labeled rows and columns.
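
To make the answer concrete in an interview, it can help to show a few basic properties. A small sketch using the sample data from Step 3:

```python
import pandas as pd

# Same sample data as in Step 3
df = pd.DataFrame({
    'Student': ['Amit', 'Rahul', 'Sneha', 'Pooja', 'Karan'],
    'Marks': [85, 72, 90, 67, 78],
    'City': ['Nagpur', 'Pune', 'Mumbai', 'Delhi', 'Nagpur'],
})

print(df.shape)             # (rows, columns) -> (5, 3)
print(df.columns.tolist())  # column labels
print(df.dtypes)            # data type of each column

# A single column of a DataFrame is a Series
marks = df['Marks']
print(type(marks).__name__)  # Series

# loc selects a value by row label and column label
third_student = df.loc[2, 'Student']
print(third_student)         # Sneha
```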

What is Google Colab?

Google Colab is an online Python notebook platform.


Mini Practice Task

Create a dataset of:

  • Employee Name
  • Salary
  • Department

Then perform:

  1. Average Salary
  2. Highest Salary
  3. Department-wise grouping
  4. Bar chart visualization
