INFORMATICS PRACTICES (065)
PYTHON PANDAS - I (Part - 2)
CREATING AND DISPLAYING DATAFRAME
A DataFrame object is created by passing data in a two-dimensional format.
The data can be supplied in many different ways, such as:
- Two-dimensional dictionaries, i.e., dictionaries whose values are lists, dictionaries, ndarrays or Series objects
- Two-dimensional ndarrays (Numpy array)
- Series type object
- Another DataFrame object
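The last two items in the list above can be sketched as follows. This is only an illustration: the names s, df_from_series and df_copy, and the column label 'Marks', are assumptions made for this sketch.

```python
import pandas as pd

# A named Series becomes a one-column DataFrame;
# the Series index becomes the row labels.
s = pd.Series([82.5, 88.4, 79.3],
              index=['Raj', 'Shashikant', 'Pranav'],
              name='Marks')
df_from_series = pd.DataFrame(s)
print(df_from_series)

# An existing DataFrame can itself be passed to DataFrame()
# to build another DataFrame object with the same data.
df_copy = pd.DataFrame(df_from_series)
print(df_copy)
```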
Dictionary Method
import pandas as pd
dict1 = {'Students': ['Raj', 'Shashikant', 'Pranav', 'Tanya', 'Hinansih'],
         'Marks': [82.5, 88.4, 79.3, 92.5, 89.6],
         'Sports': ['Football', 'Athletics', 'Cricket', 'Chess', 'Carrom']}
df = pd.DataFrame(dict1)
print(df)
Output:
     Students  Marks     Sports
0         Raj   82.5   Football
1  Shashikant   88.4  Athletics
2      Pranav   79.3    Cricket
3       Tanya   92.5      Chess
4    Hinansih   89.6     Carrom
# If you add index labels ['One', 'Two', 'Three', 'Four', 'Five']:
df = pd.DataFrame(dict1, index=['One', 'Two', 'Three', 'Four', 'Five'])
print(df)
Output:
         Students  Marks     Sports
One           Raj   82.5   Football
Two    Shashikant   88.4  Athletics
Three      Pranav   79.3    Cricket
Four        Tanya   92.5      Chess
Five     Hinansih   89.6     Carrom
Example1: Given a dictionary that stores a list of section names as the value for the key Section and a list of contribution amounts as the value for the key Contri:
dict1 = {'Section': ['A', 'B', 'C', 'D', 'E'],
         'Contri': [8500, 7500, 8200, 5800, 7854]}
Write code to create and display a DataFrame using the above dictionary.
Solution:
import pandas as pd
dict1 = {'Section': ['A', 'B', 'C', 'D', 'E'],
         'Contri': [8500, 7500, 8200, 5800, 7854]}
df1 = pd.DataFrame(dict1)
print(df1)
Output:
  Section  Contri
0       A    8500
1       B    7500
2       C    8200
3       D    5800
4       E    7854
Creating a DataFrame from a 2D dictionary having values as dictionary objects:
A 2D dictionary can have dictionary objects as its values too. You can create a DataFrame object from such a 2D dictionary: the outer keys become the column labels and the inner keys become the row labels, e.g.
import pandas as pd
xyz = {'Sales': {'Name': 'Navya', 'Age': 20, 'Gender': 'Female'},
       'Marketing': {'Name': 'Muskan', 'Age': 21, 'Gender': 'Female'}}
df1 = pd.DataFrame(xyz)
print(df1)
Output:
         Sales Marketing
Name     Navya    Muskan
Age         20        21
Gender  Female    Female
Example2: Create and display a DataFrame from a 2D dictionary, Sales, which stores the quarter-wise sales as inner dictionaries for two years, as shown below:
Sales = {'Year-2020': {'Qtr1': 25000, 'Qtr2': 55000, 'Qtr3': 50000, 'Qtr4': 36500},
         'Year-2021': {'Qtr1': 30000, 'Qtr2': 45000, 'Qtr3': 48000, 'Qtr4': 41500}}
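Solution:
import pandas as pd
Sales = {'Year-2020': {'Qtr1': 25000, 'Qtr2': 55000, 'Qtr3': 50000, 'Qtr4': 36500},
         'Year-2021': {'Qtr1': 30000, 'Qtr2': 45000, 'Qtr3': 48000, 'Qtr4': 41500}}
df = pd.DataFrame(Sales)
print(df)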
Output:
Year-2020 Year-2021
Qtr1 25000 30000
Qtr2 55000 45000
Qtr3 50000 48000
Qtr4 36500 41500
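In Example2 both inner dictionaries have the same keys. When they do not, pandas uses the union of the inner keys as row labels and fills the missing cells with NaN. A minimal sketch, using a cut-down and purely hypothetical version of the Sales data (the names sales_partial and df_partial are made up for this illustration):

```python
import pandas as pd

# Hypothetical data: 'Qtr2' is missing for 2021 and 'Qtr3' is missing for 2020.
sales_partial = {'Year-2020': {'Qtr1': 25000, 'Qtr2': 55000},
                 'Year-2021': {'Qtr1': 30000, 'Qtr3': 48000}}
df_partial = pd.DataFrame(sales_partial)

# Cells with no value in the inner dictionary are shown as NaN.
print(df_partial)
```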
Creating a DataFrame Object from a List of Dictionaries/Lists
If you pass a list having dictionaries as its elements (a list of dictionaries) to the pandas.DataFrame( ) function, it creates a DataFrame object in which the keys of the inner dictionaries become the columns and the values of each inner dictionary form one row. A 2D list (a list of lists) can be passed in the same way; each inner list then forms one row.
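A minimal sketch of the list-of-dictionaries case described above. The data and the names rows and df_rows are hypothetical, chosen only for illustration:

```python
import pandas as pd

# Hypothetical data: each dictionary describes one row.
rows = [{'Name': 'Raj', 'Marks': 82.5},
        {'Name': 'Tanya', 'Marks': 92.5},
        {'Name': 'Pranav', 'Marks': 79.3}]

# The dictionary keys become the column labels;
# the rows get the default index 0, 1, 2.
df_rows = pd.DataFrame(rows)
print(df_rows)
```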
Example3: Write a program to create a DataFrame from a 2D list. Specify your own index labels.
Solution:
import pandas as pd
list1 = [[30, 40, 50, 60], [45, 78, 58, 74], [57, 69, 85, 74], [59, 65, 48, 52]]
df1 = pd.DataFrame(list1, index=['R1', 'R2', 'R3', 'R4'])
print(df1)
Output:
     0   1   2   3
R1  30  40  50  60
R2  45  78  58  74
R3  57  69  85  74
R4  59  65  48  52
Example4: Write a program to create a DataFrame from a list containing 2 lists, which hold the Target and actual Sales figures of five zonal offices. Give appropriate row labels.
Solution:
import pandas as pd
Target = [7500, 6400, 6582, 6589, 5964]
Sales = [4785, 5698, 8547, 7589, 9854]
ZonalSales = [Target, Sales]
df = pd.DataFrame(ZonalSales,
                  columns=['ZoneA', 'ZoneB', 'ZoneC', 'ZoneD', 'ZoneE'],
                  index=['Target', 'Sales'])
print(df)
Output:
        ZoneA  ZoneB  ZoneC  ZoneD  ZoneE
Target   7500   6400   6582   6589   5964
Sales    4785   5698   8547   7589   9854
Creating a DataFrame Object from a 2D ndarray
You can also pass a two-dimensional NumPy array to
DataFrame( ) to create a dataframe object.
Example5: Write a program to create a DataFrame from a 2D array as shown below:
251 546 654
458 578 654
547 596 524
Solution:
import pandas as pd
import numpy as np
ar1 = np.array([[251, 546, 654], [458, 578, 654], [547, 596, 524]])
df = pd.DataFrame(ar1)
print(df)
Output:
     0    1    2
0  251  546  654
1  458  578  654
2  547  596  524
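In Example5 the rows and columns get the default labels 0, 1, 2. As a minimal sketch, the same array can be given explicit labels through the index and columns parameters of DataFrame( ). The label values and the name df_lab are made up for illustration:

```python
import numpy as np
import pandas as pd

ar1 = np.array([[251, 546, 654], [458, 578, 654], [547, 596, 524]])

# Hypothetical row and column labels replace the default 0, 1, 2.
df_lab = pd.DataFrame(ar1,
                      index=['R1', 'R2', 'R3'],
                      columns=['C1', 'C2', 'C3'])
print(df_lab)
```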
Creating a DataFrame Object from a 2D Dictionary
with values as Series Objects
You can also create a DataFrame object by using
multiple Series objects. In a 2D dictionary, you can have the values part as
Series objects and then you can pass this dictionary as argument to create a
DataFrame object.
Example6: Consider two Series objects, staff and salaries, that store the number of people in various office branches and the salaries distributed in these branches, respectively.
Write a program to create another Series object that stores the average salary per branch, and then create a DataFrame object from these Series objects.
Solution:
import pandas as pd
import numpy as np
staff = pd.Series([30, 64, 45, 55])
salaries = pd.Series([400000, 650000, 800000, 950000])
avg = salaries / staff
org = {'People': staff, 'Amount': salaries, 'Average': avg}
df = pd.DataFrame(org)
print(df)
Output:
   People  Amount       Average
0      30  400000  13333.333333
1      64  650000  10156.250000
2      45  800000  17777.777778
3      55  950000  17272.727273
DataFrame Attributes
Some common attributes of a DataFrame object are explained with the example given below:
Example: Create a DataFrame and use the attributes one by one:
import pandas as pd
xyz = {'Sales': {'Name': 'Navya', 'Age': 20, 'Gender': 'Female'},
       'Marketing': {'Name': 'Muskan', 'Age': 21, 'Gender': 'Female'}}
df1 = pd.DataFrame(xyz)
Common uses of attributes and related calls (use them one by one in your program):
print(df1)
print(df1.index)      # The index (row labels) of the DataFrame.
print(df1.columns)    # The column labels of the DataFrame.
print(df1.axes)       # A list representing both axes (row labels and column labels) of the DataFrame.
print(df1.dtypes)     # The dtypes of the data in the DataFrame.
print(len(df1))       # The number of rows in the DataFrame.
print(df1.count())    # The number of non-missing (non-NA) values in each column.
print(df1.count(axis='columns'))  # The number of non-missing (non-NA) values in each row.
print(df1.T)          # Transpose (exchange) the index and columns.
print(df1.values)     # A NumPy representation of the DataFrame.
print(df1.ndim)       # An int representing the number of axes/array dimensions.
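Note that count( ) is a method, so it needs parentheses, and it counts non-missing values rather than rows or columns as such. The difference from len( ) only shows up when data is missing. A minimal sketch, assuming a hypothetical variant of the data above in which one Age value is missing (the names data and df_na are made up for illustration):

```python
import pandas as pd

# Hypothetical data: the 'Age' value for 'Marketing' is missing (None).
data = {'Sales': {'Name': 'Navya', 'Age': 20, 'Gender': 'Female'},
        'Marketing': {'Name': 'Muskan', 'Age': None, 'Gender': 'Female'}}
df_na = pd.DataFrame(data)

print(len(df_na))                    # number of rows
print(len(df_na.columns))            # number of columns
print(df_na.count())                 # non-missing values in each column
print(df_na.count(axis='columns'))   # non-missing values in each row
```

Here len(df_na) still reports all three rows, while count( ) reports one value fewer for the Marketing column and for the Age row.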
Using head( ) and tail( ), and displaying row-wise data in a DataFrame:
Example
import pandas as pd
x = {'Population': [74589658, 14587458, 4589658, 5065487, 5465, 5263, 7548],
     'Hospitals': [200, 405, 458, 658, 5647, 2531, 2154],
     'Schools': [7854, 9548, 6524, 7456, 4581, 5624, 1254]}
df = pd.DataFrame(x, index=['Delhi', 'Mumbai', 'Kolkata', 'Chennai',
                            'Akbarpur', 'Lucknow', 'Varanasi'])
print(df)
#print(df.loc[['Delhi', 'Chennai']])
#print(df.head())
print(df.tail(2))
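The three calls in the example above can be compared side by side. The following is a minimal sketch that reuses only the Hospitals and Schools columns of that example; the variable names cities, first_five, last_two and picked are made up for illustration:

```python
import pandas as pd

x = {'Hospitals': [200, 405, 458, 658, 5647, 2531, 2154],
     'Schools': [7854, 9548, 6524, 7456, 4581, 5624, 1254]}
cities = ['Delhi', 'Mumbai', 'Kolkata', 'Chennai',
          'Akbarpur', 'Lucknow', 'Varanasi']
df = pd.DataFrame(x, index=cities)

first_five = df.head()                  # first 5 rows by default
last_two = df.tail(2)                   # last 2 rows
picked = df.loc[['Delhi', 'Chennai']]   # rows selected by their labels

print(first_five)
print(last_two)
print(picked)
```

With no argument, head( ) and tail( ) return 5 rows; passing a number such as tail(2) changes how many rows are returned.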