About Dataset¶

This dataset is a collection of restaurants registered on Zomato in Bengaluru. It has more than 50,000 rows and 17 columns, which makes it a fairly large dataset. Working through the tasks below gives hands-on experience with how a real-world problem statement is analysed.

Data Dictionary¶

url - URL of the restaurant
address - complete address of the restaurant
name - name of the restaurant
online_order - whether the restaurant accepts online orders (Yes/No)
book_table - whether a table can be booked at the restaurant (Yes/No)
rate - rating given on the Zomato app
votes - number of people who gave a rating
phone - phone number of the restaurant
location - area of the restaurant
rest_type - restaurant type (Casual Dining, Cafe, Quick Bites, etc.)

Installing dependency¶

👉 Skip this step if the packages are already installed

1. !pip install numpy
2. !pip install pandas
3. !pip install matplotlib
4. !pip install seaborn

Step -1 Data Preprocessing and Cleaning¶

Importing Required library¶

In [1]:

# Numerical operations
import numpy as np

# Data manipulation
import pandas as pd

#Data Visualization
import matplotlib.pyplot as plt
import seaborn as sns

# Remove warnings
import warnings
warnings.filterwarnings('ignore')

In [2]:

#Load the dataset
df=pd.read_csv(r"C:\Users\Lenovo\Documents\jupyter\DataSets\zomato.csv")

# Print top 5 rows
df.head()

Out[2]:

	url	address	name	online_order	book_table	rate	votes	phone	location	rest_type	dish_liked	cuisines	approx_cost(for two people)	reviews_list	menu_item	listed_in(type)	listed_in(city)
0	https://www.zomato.com/bangalore/jalsa-banasha...	942, 21st Main Road, 2nd Stage, Banashankari, ...	Jalsa	Yes	Yes	4.1/5	775	080 42297555\r\n+91 9743772233	Banashankari	Casual Dining	Pasta, Lunch Buffet, Masala Papad, Paneer Laja...	North Indian, Mughlai, Chinese	800	[('Rated 4.0', 'RATED\n A beautiful place to ...	[]	Buffet	Banashankari
1	https://www.zomato.com/bangalore/spice-elephan...	2nd Floor, 80 Feet Road, Near Big Bazaar, 6th ...	Spice Elephant	Yes	No	4.1/5	787	080 41714161	Banashankari	Casual Dining	Momos, Lunch Buffet, Chocolate Nirvana, Thai G...	Chinese, North Indian, Thai	800	[('Rated 4.0', 'RATED\n Had been here for din...	[]	Buffet	Banashankari
2	https://www.zomato.com/SanchurroBangalore?cont...	1112, Next to KIMS Medical College, 17th Cross...	San Churro Cafe	Yes	No	3.8/5	918	+91 9663487993	Banashankari	Cafe, Casual Dining	Churros, Cannelloni, Minestrone Soup, Hot Choc...	Cafe, Mexican, Italian	800	[('Rated 3.0', "RATED\n Ambience is not that ...	[]	Buffet	Banashankari
3	https://www.zomato.com/bangalore/addhuri-udupi...	1st Floor, Annakuteera, 3rd Stage, Banashankar...	Addhuri Udupi Bhojana	No	No	3.7/5	88	+91 9620009302	Banashankari	Quick Bites	Masala Dosa	South Indian, North Indian	300	[('Rated 4.0', "RATED\n Great food and proper...	[]	Buffet	Banashankari
4	https://www.zomato.com/bangalore/grand-village...	10, 3rd Floor, Lakshmi Associates, Gandhi Baza...	Grand Village	No	No	3.8/5	166	+91 8026612447\r\n+91 9901210005	Basavanagudi	Casual Dining	Panipuri, Gol Gappe	North Indian, Rajasthani	600	[('Rated 4.0', 'RATED\n Very good restaurant ...	[]	Buffet	Banashankari

In [3]:

# check for shape
df.shape

Out[3]:

(51717, 17)

The cell above shows that the dataset is quite large: it contains 51717 observations and 17 columns.

In [4]:

# Check info of each column
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 51717 entries, 0 to 51716
Data columns (total 17 columns):
 #   Column                       Non-Null Count  Dtype 
---  ------                       --------------  ----- 
 0   url                          51717 non-null  object
 1   address                      51717 non-null  object
 2   name                         51717 non-null  object
 3   online_order                 51717 non-null  object
 4   book_table                   51717 non-null  object
 5   rate                         43942 non-null  object
 6   votes                        51717 non-null  int64 
 7   phone                        50509 non-null  object
 8   location                     51696 non-null  object
 9   rest_type                    51490 non-null  object
 10  dish_liked                   23639 non-null  object
 11  cuisines                     51672 non-null  object
 12  approx_cost(for two people)  51371 non-null  object
 13  reviews_list                 51717 non-null  object
 14  menu_item                    51717 non-null  object
 15  listed_in(type)              51717 non-null  object
 16  listed_in(city)              51717 non-null  object
dtypes: int64(1), object(16)
memory usage: 6.7+ MB

The cell above shows that there are 16 object columns and 1 integer column.

In [5]:

# Checking null values
df.isnull().sum()

Out[5]:

url                                0
address                            0
name                               0
online_order                       0
book_table                         0
rate                            7775
votes                              0
phone                           1208
location                          21
rest_type                        227
dish_liked                     28078
cuisines                          45
approx_cost(for two people)      346
reviews_list                       0
menu_item                          0
listed_in(type)                    0
listed_in(city)                    0
dtype: int64

The cell above shows that the following columns have missing values: rate, phone, location, rest_type, dish_liked, cuisines, and approx_cost(for two people).
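
Raw null counts are easier to compare when expressed as a share of the rows. The sketch below is a minimal illustration on a hypothetical toy frame (the `toy` data and the `missing_pct` name are assumptions, not part of the notebook); the same expression could be applied to `df`.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame standing in for the Zomato data
toy = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "rate": ["4.1/5", np.nan, "3.8/5", np.nan],
    "dish_liked": [np.nan, np.nan, np.nan, "Pasta"],
})

# Share of missing values per column, as a percentage, largest first
missing_pct = toy.isna().mean().mul(100).sort_values(ascending=False)
print(missing_pct)
```

Columns with a large missing share (such as dish_liked in the real data) usually need a different treatment from columns with only a handful of nulls.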

In [6]:

# check for duplicate
df.duplicated().sum()

Out[6]:

The cell above shows that there are no duplicate rows in our dataset.

In [7]:

df.columns

Out[7]:

Index(['url', 'address', 'name', 'online_order', 'book_table', 'rate', 'votes',
       'phone', 'location', 'rest_type', 'dish_liked', 'cuisines',
       'approx_cost(for two people)', 'reviews_list', 'menu_item',
       'listed_in(type)', 'listed_in(city)'],
      dtype='object')

In [8]:

# Drop columns
df1=df.drop(columns=['url', 'address','phone'])
df1.head()

Out[8]:

	name	online_order	book_table	rate	votes	location	rest_type	dish_liked	cuisines	approx_cost(for two people)	reviews_list	menu_item	listed_in(type)	listed_in(city)
0	Jalsa	Yes	Yes	4.1/5	775	Banashankari	Casual Dining	Pasta, Lunch Buffet, Masala Papad, Paneer Laja...	North Indian, Mughlai, Chinese	800	[('Rated 4.0', 'RATED\n A beautiful place to ...	[]	Buffet	Banashankari
1	Spice Elephant	Yes	No	4.1/5	787	Banashankari	Casual Dining	Momos, Lunch Buffet, Chocolate Nirvana, Thai G...	Chinese, North Indian, Thai	800	[('Rated 4.0', 'RATED\n Had been here for din...	[]	Buffet	Banashankari
2	San Churro Cafe	Yes	No	3.8/5	918	Banashankari	Cafe, Casual Dining	Churros, Cannelloni, Minestrone Soup, Hot Choc...	Cafe, Mexican, Italian	800	[('Rated 3.0', "RATED\n Ambience is not that ...	[]	Buffet	Banashankari
3	Addhuri Udupi Bhojana	No	No	3.7/5	88	Banashankari	Quick Bites	Masala Dosa	South Indian, North Indian	300	[('Rated 4.0', "RATED\n Great food and proper...	[]	Buffet	Banashankari
4	Grand Village	No	No	3.8/5	166	Basavanagudi	Casual Dining	Panipuri, Gol Gappe	North Indian, Rajasthani	600	[('Rated 4.0', 'RATED\n Very good restaurant ...	[]	Buffet	Banashankari

In the cell above we remove the columns we do not need (url, address, and phone) and store the result in a new dataframe, df1.

In [9]:

# Rename Columns
df1.rename(columns={'approx_cost(for two people)':'cost','listed_in(type)':'service','listed_in(city)':'city'},inplace=True)
df1

Out[9]:

	name	online_order	book_table	rate	votes	location	rest_type	dish_liked	cuisines	cost	reviews_list	menu_item	service	city
0	Jalsa	Yes	Yes	4.1/5	775	Banashankari	Casual Dining	Pasta, Lunch Buffet, Masala Papad, Paneer Laja...	North Indian, Mughlai, Chinese	800	[('Rated 4.0', 'RATED\n A beautiful place to ...	[]	Buffet	Banashankari
1	Spice Elephant	Yes	No	4.1/5	787	Banashankari	Casual Dining	Momos, Lunch Buffet, Chocolate Nirvana, Thai G...	Chinese, North Indian, Thai	800	[('Rated 4.0', 'RATED\n Had been here for din...	[]	Buffet	Banashankari
2	San Churro Cafe	Yes	No	3.8/5	918	Banashankari	Cafe, Casual Dining	Churros, Cannelloni, Minestrone Soup, Hot Choc...	Cafe, Mexican, Italian	800	[('Rated 3.0', "RATED\n Ambience is not that ...	[]	Buffet	Banashankari
3	Addhuri Udupi Bhojana	No	No	3.7/5	88	Banashankari	Quick Bites	Masala Dosa	South Indian, North Indian	300	[('Rated 4.0', "RATED\n Great food and proper...	[]	Buffet	Banashankari
4	Grand Village	No	No	3.8/5	166	Basavanagudi	Casual Dining	Panipuri, Gol Gappe	North Indian, Rajasthani	600	[('Rated 4.0', 'RATED\n Very good restaurant ...	[]	Buffet	Banashankari
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
51712	Best Brews - Four Points by Sheraton Bengaluru...	No	No	3.6 /5	27	Whitefield	Bar	NaN	Continental	1,500	[('Rated 5.0', "RATED\n Food and service are ...	[]	Pubs and bars	Whitefield
51713	Vinod Bar And Restaurant	No	No	NaN	0	Whitefield	Bar	NaN	Finger Food	600	[]	[]	Pubs and bars	Whitefield
51714	Plunge - Sheraton Grand Bengaluru Whitefield H...	No	No	NaN	0	Whitefield	Bar	NaN	Finger Food	2,000	[]	[]	Pubs and bars	Whitefield
51715	Chime - Sheraton Grand Bengaluru Whitefield Ho...	No	Yes	4.3 /5	236	ITPL Main Road, Whitefield	Bar	Cocktails, Pizza, Buttermilk	Finger Food	2,500	[('Rated 4.0', 'RATED\n Nice and friendly pla...	[]	Pubs and bars	Whitefield
51716	The Nest - The Den Bengaluru	No	No	3.4 /5	13	ITPL Main Road, Whitefield	Bar, Casual Dining	NaN	Finger Food, North Indian, Continental	1,500	[('Rated 5.0', 'RATED\n Great ambience , look...	[]	Pubs and bars	Whitefield

51717 rows × 14 columns

In the cell above we rename three columns for better readability.

In [10]:

#Check for null values
df1.isna().sum()

Out[10]:

name                0
online_order        0
book_table          0
rate             7775
votes               0
location           21
rest_type         227
dish_liked      28078
cuisines           45
cost              346
reviews_list        0
menu_item           0
service             0
city                0
dtype: int64

Data Cleaning¶

In [11]:

# Threshold: 5% of the number of rows
threshold = len(df1) * 0.05
threshold

Out[11]:

2585.8500000000004

In the cell above we set a threshold of 5% of the rows. For each column whose count of missing values is at or below this threshold, we drop the rows where that column is null; the columns themselves are kept.

In [12]:

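# Columns whose null count is at or below the threshold; despite the name, only rows with nulls in these columns are dropped
cols_to_drop = df1.columns[df1.isna().sum() <= threshold]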
cols_to_drop

Out[12]:

Index(['name', 'online_order', 'book_table', 'votes', 'location', 'rest_type',
       'cuisines', 'cost', 'reviews_list', 'menu_item', 'service', 'city'],
      dtype='object')

In [13]:

# Drop rows that have nulls in the selected columns
df1.dropna(subset=cols_to_drop,inplace=True)
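
The thresholded drop above can be reproduced on a small example to see exactly which rows disappear. This is only a sketch on a hypothetical toy frame; `toy`, `sparse_null_cols`, and `cleaned` are illustrative names, not part of the notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame: 40 rows, 'location' has 1 null (2.5%), 'rate' has 10 nulls (25%)
toy = pd.DataFrame({
    "location": ["BTM"] * 39 + [np.nan],
    "rate": [np.nan] * 10 + ["4.0/5"] * 30,
})

threshold = len(toy) * 0.05                      # 5% of the rows -> 2.0
null_counts = toy.isna().sum()

# Columns with few nulls: cheap to handle by dropping the affected rows
sparse_null_cols = toy.columns[null_counts <= threshold]

# dropna returns a new frame; only rows with nulls in the selected columns go
cleaned = toy.dropna(subset=sparse_null_cols)
print(list(sparse_null_cols), cleaned.shape)
```

Only the single row with a missing location is removed; the ten missing ratings stay and have to be handled separately, which mirrors what happens to rate and dish_liked in the notebook.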

In [14]:

#again check for null values
df1.isna().sum()

Out[14]:

name                0
online_order        0
book_table          0
rate             7615
votes               0
location            0
rest_type           0
dish_liked      27713
cuisines            0
cost                0
reviews_list        0
menu_item           0
service             0
city                0
dtype: int64

After dropping those rows, two columns (rate and dish_liked) still have missing values, so we fill them next.

Filling missing values in `dish_liked` Column¶

In [15]:

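# Most frequent value (mode) of dish_liked
df1.dish_liked.mode()
# Fill missing values with that mode; assign the result back rather than calling inplace=True on a selected column
df1['dish_liked'] = df1['dish_liked'].fillna('Biryani')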

In the cell above we fill the missing values in the dish_liked column. Because the column holds categorical data, we fill the gaps with its mode.
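
The mode can also be read from the data instead of being typed in by hand. The following is a minimal sketch on a hypothetical toy series (the `dishes`, `most_common`, and `filled` names are assumptions for illustration).

```python
import numpy as np
import pandas as pd

# Hypothetical toy column with two missing entries
dishes = pd.Series(["Biryani", np.nan, "Pasta", "Biryani", np.nan], name="dish_liked")

# mode() returns a Series (there can be ties); take its first entry
most_common = dishes.mode()[0]

# fillna returns a new Series, so assign the result
filled = dishes.fillna(most_common)
print(most_common, filled.tolist())
```

Note that in this dataset 27713 of the 51148 remaining rows have no dish_liked value, so more than half of the column ends up holding the imputed value; any later analysis of popular dishes should keep that in mind.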

Clean the data of the `rate` column¶

In [16]:

df1.rate.unique()

Out[16]:

array(['4.1/5', '3.8/5', '3.7/5', '3.6/5', '4.6/5', '4.0/5', '4.2/5',
       '3.9/5', '3.1/5', '3.0/5', '3.2/5', '3.3/5', '2.8/5', '4.4/5',
       '4.3/5', 'NEW', '2.9/5', '3.5/5', nan, '2.6/5', '3.8 /5', '3.4/5',
       '4.5/5', '2.5/5', '2.7/5', '4.7/5', '2.4/5', '2.2/5', '2.3/5',
       '3.4 /5', '-', '3.6 /5', '4.8/5', '3.9 /5', '4.2 /5', '4.0 /5',
       '4.1 /5', '3.7 /5', '3.1 /5', '2.9 /5', '3.3 /5', '2.8 /5',
       '3.5 /5', '2.7 /5', '2.5 /5', '3.2 /5', '2.6 /5', '4.5 /5',
       '4.3 /5', '4.4 /5', '4.9/5', '2.1/5', '2.0/5', '1.8/5', '4.6 /5',
       '4.9 /5', '3.0 /5', '4.8 /5', '2.3 /5', '4.7 /5', '2.4 /5',
       '2.1 /5', '2.2 /5', '2.0 /5', '1.8 /5'], dtype=object)

In [17]:

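# Remove the '-' and 'NEW' placeholders (str.strip('NEW') strips the characters N, E, W, not the word)
df1['rate']= df1['rate'].str.replace('-','',regex=False)
df1['rate']= df1['rate'].str.replace('NEW','',regex=False)
# Remove the '/5' suffix and any spaces
df1['rate']= df1['rate'].str.replace('/5','',regex=False)
df1['rate']= df1['rate'].str.replace(' ','',regex=False)
df1['rate'].unique()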

Out[17]:

array(['4.1', '3.8', '3.7', '3.6', '4.6', '4.0', '4.2', '3.9', '3.1',
       '3.0', '3.2', '3.3', '2.8', '4.4', '4.3', '', '2.9', '3.5', nan,
       '2.6', '3.4', '4.5', '2.5', '2.7', '4.7', '2.4', '2.2', '2.3',
       '4.8', '4.9', '2.1', '2.0', '1.8'], dtype=object)

In [18]:

df1.rate=pd.to_numeric(df1.rate)

In [19]:

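# Fill missing ratings with the mean rating; assign the result back rather than using inplace=True
df1['rate'] = df1['rate'].fillna(df1['rate'].mean())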
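
The rating clean-up above (strip the suffix, convert to numbers, fill with the mean) can be condensed into a few calls. This is a sketch under the assumption that every valid rating looks like `4.1/5` or `4.1 /5`; the toy series and variable names are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical sample of the raw 'rate' values seen in the notebook
raw = pd.Series(["4.1/5", "3.8 /5", "NEW", "-", np.nan], name="rate")

# Remove the '/5' suffix and surrounding spaces, then coerce anything
# that is still non-numeric ('NEW', '-') to NaN
cleaned = pd.to_numeric(
    raw.str.replace("/5", "", regex=False).str.strip(),
    errors="coerce",
)

# Fill the gaps with the mean of the valid ratings
filled = cleaned.fillna(cleaned.mean())
print(filled.tolist())
```

Using `errors="coerce"` means unexpected placeholders become NaN instead of raising an error, at the cost of silently hiding any value that fails to parse, so it is worth checking `cleaned.isna().sum()` afterwards.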

In [20]:

df1.isna().sum()

Out[20]:

name            0
online_order    0
book_table      0
rate            0
votes           0
location        0
rest_type       0
dish_liked      0
cuisines        0
cost            0
reviews_list    0
menu_item       0
service         0
city            0
dtype: int64

In [21]:

df1.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 51148 entries, 0 to 51716
Data columns (total 14 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   name          51148 non-null  object 
 1   online_order  51148 non-null  object 
 2   book_table    51148 non-null  object 
 3   rate          51148 non-null  float64
 4   votes         51148 non-null  int64  
 5   location      51148 non-null  object 
 6   rest_type     51148 non-null  object 
 7   dish_liked    51148 non-null  object 
 8   cuisines      51148 non-null  object 
 9   cost          51148 non-null  object 
 10  reviews_list  51148 non-null  object 
 11  menu_item     51148 non-null  object 
 12  service       51148 non-null  object 
 13  city          51148 non-null  object 
dtypes: float64(1), int64(1), object(12)
memory usage: 5.9+ MB

Clean the data of the `cost` column¶

In [22]:

df1.cost.unique()

Out[22]:

array(['800', '300', '600', '700', '550', '500', '450', '650', '400',
       '900', '200', '750', '150', '850', '100', '1,200', '350', '250',
       '950', '1,000', '1,500', '1,300', '199', '80', '1,100', '160',
       '1,600', '230', '130', '50', '190', '1,700', '1,400', '180',
       '1,350', '2,200', '2,000', '1,800', '1,900', '330', '2,500',
       '2,100', '3,000', '2,800', '3,400', '40', '1,250', '3,500',
       '4,000', '2,400', '2,600', '120', '1,450', '469', '70', '3,200',
       '60', '560', '240', '360', '6,000', '1,050', '2,300', '4,100',
       '5,000', '3,700', '1,650', '2,700', '4,500', '140'], dtype=object)

In [23]:

df1.cost=df1.cost.str.replace(",","")

In [24]:

df1.cost=df1.cost.astype(int)
df1.cost.unique()

Out[24]:

array([ 800,  300,  600,  700,  550,  500,  450,  650,  400,  900,  200,
        750,  150,  850,  100, 1200,  350,  250,  950, 1000, 1500, 1300,
        199,   80, 1100,  160, 1600,  230,  130,   50,  190, 1700, 1400,
        180, 1350, 2200, 2000, 1800, 1900,  330, 2500, 2100, 3000, 2800,
       3400,   40, 1250, 3500, 4000, 2400, 2600,  120, 1450,  469,   70,
       3200,   60,  560,  240,  360, 6000, 1050, 2300, 4100, 5000, 3700,
       1650, 2700, 4500,  140])

In [25]:

df1.cost.dtype

Out[25]:

dtype('int32')

In [26]:

df1.votes=pd.to_numeric(df1.votes)
df1.votes

Out[26]:

0        775
1        787
2        918
3         88
4        166
        ... 
51712     27
51713      0
51714      0
51715    236
51716     13
Name: votes, Length: 51148, dtype: int64

Step -2 Data analysis¶

In [27]:

df1

Out[27]:

	name	online_order	book_table	rate	votes	location	rest_type	dish_liked	cuisines	cost	reviews_list	menu_item	service	city
0	Jalsa	Yes	Yes	4.100000	775	Banashankari	Casual Dining	Pasta, Lunch Buffet, Masala Papad, Paneer Laja...	North Indian, Mughlai, Chinese	800	[('Rated 4.0', 'RATED\n A beautiful place to ...	[]	Buffet	Banashankari
1	Spice Elephant	Yes	No	4.100000	787	Banashankari	Casual Dining	Momos, Lunch Buffet, Chocolate Nirvana, Thai G...	Chinese, North Indian, Thai	800	[('Rated 4.0', 'RATED\n Had been here for din...	[]	Buffet	Banashankari
2	San Churro Cafe	Yes	No	3.800000	918	Banashankari	Cafe, Casual Dining	Churros, Cannelloni, Minestrone Soup, Hot Choc...	Cafe, Mexican, Italian	800	[('Rated 3.0', "RATED\n Ambience is not that ...	[]	Buffet	Banashankari
3	Addhuri Udupi Bhojana	No	No	3.700000	88	Banashankari	Quick Bites	Masala Dosa	South Indian, North Indian	300	[('Rated 4.0', "RATED\n Great food and proper...	[]	Buffet	Banashankari
4	Grand Village	No	No	3.800000	166	Basavanagudi	Casual Dining	Panipuri, Gol Gappe	North Indian, Rajasthani	600	[('Rated 4.0', 'RATED\n Very good restaurant ...	[]	Buffet	Banashankari
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
51712	Best Brews - Four Points by Sheraton Bengaluru...	No	No	3.600000	27	Whitefield	Bar	Biryani	Continental	1500	[('Rated 5.0', "RATED\n Food and service are ...	[]	Pubs and bars	Whitefield
51713	Vinod Bar And Restaurant	No	No	3.702011	0	Whitefield	Bar	Biryani	Finger Food	600	[]	[]	Pubs and bars	Whitefield
51714	Plunge - Sheraton Grand Bengaluru Whitefield H...	No	No	3.702011	0	Whitefield	Bar	Biryani	Finger Food	2000	[]	[]	Pubs and bars	Whitefield
51715	Chime - Sheraton Grand Bengaluru Whitefield Ho...	No	Yes	4.300000	236	ITPL Main Road, Whitefield	Bar	Cocktails, Pizza, Buttermilk	Finger Food	2500	[('Rated 4.0', 'RATED\n Nice and friendly pla...	[]	Pubs and bars	Whitefield
51716	The Nest - The Den Bengaluru	No	No	3.400000	13	ITPL Main Road, Whitefield	Bar, Casual Dining	Biryani	Finger Food, North Indian, Continental	1500	[('Rated 5.0', 'RATED\n Great ambience , look...	[]	Pubs and bars	Whitefield

51148 rows × 14 columns

Q1 Restaurants delivering Online or not¶

In [28]:

x = df1["online_order"].value_counts()
y = x.index
plt.pie(x=x, labels=y,colors = ['lightcoral', 'lightskyblue'], autopct='%.0f%%',explode = (0, 0.1),shadow=True)
plt.title("Online order Percentage",color="black");

From the pie chart above, 59% of the restaurants in our data accept online orders, while 41% do not.
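
The percentages printed on the pie chart can also be computed directly, which is handy when the exact figures are needed in text. A minimal sketch on a hypothetical toy column (`orders` and `share` are illustrative names):

```python
import pandas as pd

# Hypothetical toy column standing in for df1["online_order"]
orders = pd.Series(["Yes", "Yes", "No", "Yes", "No"], name="online_order")

# normalize=True returns proportions instead of counts
share = orders.value_counts(normalize=True).mul(100).round(1)
print(share)
```

The same expression applied to `df1["book_table"]` would give the table-booking split.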

Q2 Restaurants allowing table booking or not¶

In [29]:

x = df1["book_table"].value_counts()
y = x.index
plt.pie(x=x, labels=y,colors = ['#99ff99', '#ffcc99'], autopct='%.0f%%',explode = (0, 0.1),shadow=True)
plt.title("Table Booking ratio",color="black");

From the pie chart above, 87% of restaurants in Bangalore do not allow online table booking, while 13% do.

Q3 Table booking vs Rate¶

In [30]:

sns.set_style('darkgrid')
sns.barplot(data=df1,x='book_table',y='rate')
plt.show()

The bar plot above indicates that restaurants allowing online table bookings tend to receive higher ratings compared to those that do not offer this service.
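
The bar heights in such a plot are group means, so the comparison can be checked numerically with a groupby. The sketch below uses a hypothetical toy frame; `toy` and `summary` are assumed names.

```python
import pandas as pd

# Hypothetical toy data: table booking flag and rating
toy = pd.DataFrame({
    "book_table": ["Yes", "Yes", "No", "No", "No"],
    "rate": [4.2, 4.0, 3.6, 3.8, 3.4],
})

# Mean rating and number of restaurants in each group
summary = toy.groupby("book_table")["rate"].agg(["mean", "count"])
print(summary)
```

Including the count alongside the mean shows how many restaurants each bar rests on. A difference in means is an association only; it does not by itself show that offering table booking raises ratings.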

Q4 Best Location¶

In [31]:

df1.location.unique()

Out[31]:

array(['Banashankari', 'Basavanagudi', 'Mysore Road', 'Jayanagar',
       'Kumaraswamy Layout', 'Rajarajeshwari Nagar', 'Vijay Nagar',
       'Uttarahalli', 'JP Nagar', 'South Bangalore', 'City Market',
       'Nagarbhavi', 'Bannerghatta Road', 'BTM', 'Kanakapura Road',
       'Bommanahalli', 'CV Raman Nagar', 'Electronic City', 'HSR',
       'Marathahalli', 'Wilson Garden', 'Shanti Nagar',
       'Koramangala 5th Block', 'Koramangala 8th Block', 'Richmond Road',
       'Koramangala 7th Block', 'Jalahalli', 'Koramangala 4th Block',
       'Bellandur', 'Sarjapur Road', 'Whitefield', 'East Bangalore',
       'Old Airport Road', 'Indiranagar', 'Koramangala 1st Block',
       'Frazer Town', 'RT Nagar', 'MG Road', 'Brigade Road',
       'Lavelle Road', 'Church Street', 'Ulsoor', 'Residency Road',
       'Shivajinagar', 'Infantry Road', 'St. Marks Road',
       'Cunningham Road', 'Race Course Road', 'Commercial Street',
       'Vasanth Nagar', 'HBR Layout', 'Domlur', 'Ejipura',
       'Jeevan Bhima Nagar', 'Old Madras Road', 'Malleshwaram',
       'Seshadripuram', 'Kammanahalli', 'Koramangala 6th Block',
       'Majestic', 'Langford Town', 'Central Bangalore', 'Sanjay Nagar',
       'Brookefield', 'ITPL Main Road, Whitefield',
       'Varthur Main Road, Whitefield', 'KR Puram',
       'Koramangala 2nd Block', 'Koramangala 3rd Block', 'Koramangala',
       'Hosur Road', 'Rajajinagar', 'Banaswadi', 'North Bangalore',
       'Nagawara', 'Hennur', 'Kalyan Nagar', 'New BEL Road', 'Jakkur',
       'Rammurthy Nagar', 'Thippasandra', 'Kaggadasapura', 'Hebbal',
       'Kengeri', 'Sankey Road', 'Sadashiv Nagar', 'Basaveshwara Nagar',
       'Yeshwantpur', 'West Bangalore', 'Magadi Road', 'Yelahanka',
       'Sahakara Nagar', 'Peenya'], dtype=object)

In [32]:

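x=df1.groupby('location')[['rate','votes']].mean()
# Note: sort_values returns a new DataFrame and its result is not assigned here,
# so x keeps the alphabetical order of the groupby and head(10) is not a top-10 ranking
x.sort_values(by=['rate','votes'],ascending=False)
x=x.head(10)
l=x.index.to_list()
x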

Out[32]:

	rate	votes
location
BTM	3.602110	113.204102
Banashankari	3.659095	179.617257
Banaswadi	3.552800	54.131783
Bannerghatta Road	3.555788	133.463066
Basavanagudi	3.675116	138.770468
Basaveshwara Nagar	3.664752	95.962567
Bellandur	3.568176	161.537372
Bommanahalli	3.390191	32.639831
Brigade Road	3.704790	352.725780
Brookefield	3.617267	181.344512

In [53]:

sns.barplot(x=x.index,y=x.rate,data=x,palette='magma')
plt.xticks(rotation=90)
plt.title("Best location by rating")
plt.show()

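The bar plot above shows the mean rating for the ten locations in the table, which are the first ten in alphabetical order rather than a ranking of all locations. Among these ten, Brigade Road has the highest mean rating (about 3.70), followed by Basavanagudi (about 3.68), while Bommanahalli has the lowest (about 3.39).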

In [54]:

sns.barplot(x=x.index,y=x.votes,data=x,palette='viridis')

plt.xticks(rotation=90)
plt.title("Best location by votes")
plt.show()

The bar plot above displays the mean number of votes per restaurant for the same ten locations. Among them, Brigade Road has the highest mean vote count (about 353), followed by Brookefield (about 181) and Banashankari (about 180).
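
To rank all locations rather than the first ten alphabetically, the sorted frame has to be kept, because `sort_values` does not modify the frame it is called on. The following is a minimal sketch on hypothetical toy data (`toy`, `by_location`, and `top_locations` are illustrative names); it is not the notebook's original output.

```python
import pandas as pd

# Hypothetical toy data: three locations with a few restaurants each
toy = pd.DataFrame({
    "location": ["BTM", "BTM", "Brigade Road", "Brigade Road", "Ulsoor"],
    "rate": [3.5, 3.7, 4.0, 4.2, 3.9],
    "votes": [100, 120, 300, 400, 50],
})

by_location = toy.groupby("location")[["rate", "votes"]].mean()

# sort_values returns a new DataFrame, so the result must be assigned
top_locations = by_location.sort_values(by=["rate", "votes"], ascending=False).head(2)
print(top_locations)
```

On the real data one might also require a minimum number of restaurants per location before ranking, since a mean over very few restaurants is unstable.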

Q5 Relation between Location and Rating¶

In [35]:

plt.figure(figsize=(15, 6))
sns.scatterplot(x=df1.location,y=df1.rate)
plt.xticks(rotation=90)

plt.show()

Q6 Restaurant Type¶

In [36]:

df1.head(2)

Out[36]:

	name	online_order	book_table	rate	votes	location	rest_type	dish_liked	cuisines	cost	reviews_list	menu_item	service	city
0	Jalsa	Yes	Yes	4.1	775	Banashankari	Casual Dining	Pasta, Lunch Buffet, Masala Papad, Paneer Laja...	North Indian, Mughlai, Chinese	800	[('Rated 4.0', 'RATED\n A beautiful place to ...	[]	Buffet	Banashankari
1	Spice Elephant	Yes	No	4.1	787	Banashankari	Casual Dining	Momos, Lunch Buffet, Chocolate Nirvana, Thai G...	Chinese, North Indian, Thai	800	[('Rated 4.0', 'RATED\n Had been here for din...	[]	Buffet	Banashankari

In [37]:

df1.rest_type.unique()

Out[37]:

array(['Casual Dining', 'Cafe, Casual Dining', 'Quick Bites',
       'Casual Dining, Cafe', 'Cafe', 'Quick Bites, Cafe',
       'Cafe, Quick Bites', 'Delivery', 'Mess', 'Dessert Parlor',
       'Bakery, Dessert Parlor', 'Pub', 'Bakery', 'Takeaway, Delivery',
       'Fine Dining', 'Beverage Shop', 'Sweet Shop', 'Bar',
       'Beverage Shop, Quick Bites', 'Confectionery',
       'Quick Bites, Beverage Shop', 'Dessert Parlor, Sweet Shop',
       'Bakery, Quick Bites', 'Sweet Shop, Quick Bites', 'Kiosk',
       'Food Truck', 'Quick Bites, Dessert Parlor',
       'Beverage Shop, Dessert Parlor', 'Takeaway', 'Pub, Casual Dining',
       'Casual Dining, Bar', 'Dessert Parlor, Beverage Shop',
       'Quick Bites, Bakery', 'Dessert Parlor, Quick Bites',
       'Microbrewery, Casual Dining', 'Lounge', 'Bar, Casual Dining',
       'Food Court', 'Cafe, Bakery', 'Dhaba', 'Quick Bites, Sweet Shop',
       'Microbrewery', 'Food Court, Quick Bites', 'Pub, Bar',
       'Casual Dining, Pub', 'Lounge, Bar', 'Food Court, Dessert Parlor',
       'Casual Dining, Sweet Shop', 'Food Court, Casual Dining',
       'Casual Dining, Microbrewery', 'Sweet Shop, Dessert Parlor',
       'Bakery, Beverage Shop', 'Lounge, Casual Dining',
       'Cafe, Food Court', 'Beverage Shop, Cafe', 'Cafe, Dessert Parlor',
       'Dessert Parlor, Cafe', 'Dessert Parlor, Bakery',
       'Microbrewery, Pub', 'Bakery, Food Court', 'Club',
       'Quick Bites, Food Court', 'Bakery, Cafe', 'Bar, Cafe',
       'Pub, Cafe', 'Casual Dining, Irani Cafee', 'Fine Dining, Lounge',
       'Bar, Quick Bites', 'Bakery, Kiosk', 'Pub, Microbrewery',
       'Microbrewery, Lounge', 'Fine Dining, Microbrewery',
       'Fine Dining, Bar', 'Mess, Quick Bites', 'Dessert Parlor, Kiosk',
       'Bhojanalya', 'Casual Dining, Quick Bites', 'Pop Up', 'Cafe, Bar',
       'Casual Dining, Lounge', 'Bakery, Sweet Shop', 'Microbrewery, Bar',
       'Cafe, Lounge', 'Bar, Pub', 'Lounge, Cafe', 'Club, Casual Dining',
       'Quick Bites, Mess', 'Quick Bites, Meat Shop',
       'Quick Bites, Kiosk', 'Lounge, Microbrewery',
       'Food Court, Beverage Shop', 'Dessert Parlor, Food Court',
       'Bar, Lounge'], dtype=object)

Q7 Relation between Rest type and Rating¶

In [38]:

top_rest_by_rating=df1.groupby('rest_type')['rate'].max().sort_values(ascending=False).head(10)
top_rest_by_rating

Out[38]:

rest_type
Microbrewery           4.9
Dessert Parlor         4.9
Casual Dining, Bar     4.9
Casual Dining          4.9
Bakery                 4.8
Quick Bites            4.8
Pub, Microbrewery      4.8
Pub                    4.8
Bar                    4.8
Cafe, Casual Dining    4.7
Name: rate, dtype: float64

In [39]:

plt.figure(figsize=(8, 6))

plt.barh(top_rest_by_rating.index, top_rest_by_rating, color = ['#1f77b4','#ff7f0e','#2ca02c','#d62728','#9467bd','#8c564b','#e377c2','#7f7f7f','#bcbd22','#17becf']
)


plt.title("Top 10 rest type by rating")
plt.xticks(rotation=90)
plt.show()

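The horizontal bar chart above shows the ten restaurant types with the highest maximum rating. Casual Dining; Casual Dining, Bar; Dessert Parlor; and Microbrewery each include at least one restaurant rated 4.9. Because the chart uses the maximum, it reflects the best single restaurant of each type rather than the typical rating of that type.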
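
A maximum can be driven by one outstanding restaurant, so it is often useful to compare it with the mean and the group size. The sketch below uses hypothetical toy data, and the minimum count of 2 is an arbitrary cutoff chosen for illustration.

```python
import pandas as pd

# Hypothetical toy data: one very highly rated cafe among average ones
toy = pd.DataFrame({
    "rest_type": ["Cafe", "Cafe", "Cafe", "Bar", "Bar", "Kiosk"],
    "rate": [4.9, 3.0, 3.1, 4.0, 4.2, 4.5],
})

stats = toy.groupby("rest_type")["rate"].agg(["max", "mean", "count"])

# Keep types with at least 2 restaurants (hypothetical cutoff), then rank by mean
ranked = stats[stats["count"] >= 2].sort_values("mean", ascending=False)
print(ranked)
```

Here Cafe has the highest maximum, yet Bar ranks first by mean, which shows how the choice of aggregate changes the answer to "best restaurant type".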

Q8 Types of Services¶

In [40]:

df1.service.unique()

Out[40]:

array(['Buffet', 'Cafes', 'Delivery', 'Desserts', 'Dine-out',
       'Drinks & nightlife', 'Pubs and bars'], dtype=object)

In [41]:

sns.countplot(x="service",data=df1)
plt.title("Types of services")
plt.xticks(rotation=60)
plt.show()

The bar plot above shows how many listings fall under each type of service offered by the restaurants.

Q9 Relation between service and Rating¶

In [42]:

service_type=df1.groupby('service')[['rate']].mean().sort_values(by=['rate'],ascending=False).head(10)
service_type

Out[42]:

	rate
service
Drinks & nightlife	4.003110
Pubs and bars	3.996530
Buffet	3.973991
Cafes	3.852957
Desserts	3.760666
Dine-out	3.686423
Delivery	3.664599

In [43]:

color=['#d62728','#9467bd','#8c564b','#17becf','#e377c2','#7f7f7f',
    '#bcbd22']
sns.set_palette(color)
sns.barplot(x=service_type.index,y='rate',data=service_type)
plt.xticks(rotation=90)
plt.title("Top services by rating")
plt.show()

The bar plot above indicates that Drinks & nightlife and Pubs and bars are the service types with the highest average ratings among restaurants in Bangalore.

Q10 Cost of Restaurant¶

In [44]:

df1.head(2)

Out[44]:

	name	online_order	book_table	rate	votes	location	rest_type	dish_liked	cuisines	cost	reviews_list	menu_item	service	city
0	Jalsa	Yes	Yes	4.1	775	Banashankari	Casual Dining	Pasta, Lunch Buffet, Masala Papad, Paneer Laja...	North Indian, Mughlai, Chinese	800	[('Rated 4.0', 'RATED\n A beautiful place to ...	[]	Buffet	Banashankari
1	Spice Elephant	Yes	No	4.1	787	Banashankari	Casual Dining	Momos, Lunch Buffet, Chocolate Nirvana, Thai G...	Chinese, North Indian, Thai	800	[('Rated 4.0', 'RATED\n Had been here for din...	[]	Buffet	Banashankari

In [45]:

df1.name.value_counts().sort_values(ascending=False)

Out[45]:

Cafe Coffee Day                                            96
Onesta                                                     85
Just Bake                                                  73
Empire Restaurant                                          71
Five Star Chicken                                          70
                                                           ..
Tango                                                       1
Annapurna Veg                                               1
Venkat Naidu Restaurant                                     1
A-one Dum Biryani                                           1
Plunge - Sheraton Grand Bengaluru Whitefield Hotel &...     1
Name: name, Length: 8723, dtype: int64

In [46]:

df1.cost.unique()

Out[46]:

array([ 800,  300,  600,  700,  550,  500,  450,  650,  400,  900,  200,
        750,  150,  850,  100, 1200,  350,  250,  950, 1000, 1500, 1300,
        199,   80, 1100,  160, 1600,  230,  130,   50,  190, 1700, 1400,
        180, 1350, 2200, 2000, 1800, 1900,  330, 2500, 2100, 3000, 2800,
       3400,   40, 1250, 3500, 4000, 2400, 2600,  120, 1450,  469,   70,
       3200,   60,  560,  240,  360, 6000, 1050, 2300, 4100, 5000, 3700,
       1650, 2700, 4500,  140])

In [47]:

expensive_restaurant=df1.groupby('name')['cost'].max().sort_values(ascending=False).head(10)
expensive_restaurant

Out[47]:

name
Le Cirque Signature - The Leela Palace    6000
Royal Afghan - ITC Windsor                5000
Malties - Radisson Blu                    4500
La Brasserie - Le Meridien                4100
Dakshin - ITC Windsor                     4000
Alba - JW Marriott Bengaluru              4000
Edo Restaurant & Bar - ITC Gardenia       4000
Dum Pukht Jolly Nabobs - ITC Windsor      4000
Riwaz - The Ritz-Carlton                  4000
Grill 99 - The Ritz-Carlton               4000
Name: cost, dtype: int32

In [48]:

sns.barplot(x=expensive_restaurant.index,y=expensive_restaurant)
plt.xticks(rotation=90)
plt.title("Top 10 expensive restaurant")
plt.show()

From the bar plot above we can see that the most expensive restaurant in Bangalore, by approximate cost for two people, is Le Cirque Signature - The Leela Palace, followed by Royal Afghan - ITC Windsor, and so on.
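
Grouping by name discards the other columns, so the location of each expensive restaurant is lost. One way to keep it is to sort by cost and drop repeated names, as in this sketch on hypothetical toy data (`toy` and `priciest` are assumed names).

```python
import pandas as pd

# Hypothetical toy data: 'Alpha' is listed twice, as chains are in the real data
toy = pd.DataFrame({
    "name": ["Alpha", "Alpha", "Beta", "Gamma"],
    "location": ["MG Road", "Ulsoor", "BTM", "HSR"],
    "cost": [4000, 3500, 6000, 500],
})

# Sort by cost, keep the costliest listing per name, then take the top 2
priciest = (
    toy.sort_values("cost", ascending=False)
       .drop_duplicates(subset="name")
       .head(2)[["name", "location", "cost"]]
)
print(priciest)
```

The same pattern, with `head(5)`, is one possible starting point for finding the locations of the five most expensive restaurants.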

Q11 Number of restaurants in a Location¶

In [49]:

df1.head(2)

Out[49]:

	name	online_order	book_table	rate	votes	location	rest_type	dish_liked	cuisines	cost	reviews_list	menu_item	service	city
0	Jalsa	Yes	Yes	4.1	775	Banashankari	Casual Dining	Pasta, Lunch Buffet, Masala Papad, Paneer Laja...	North Indian, Mughlai, Chinese	800	[('Rated 4.0', 'RATED\n A beautiful place to ...	[]	Buffet	Banashankari
1	Spice Elephant	Yes	No	4.1	787	Banashankari	Casual Dining	Momos, Lunch Buffet, Chocolate Nirvana, Thai G...	Chinese, North Indian, Thai	800	[('Rated 4.0', 'RATED\n Had been here for din...	[]	Buffet	Banashankari

In [50]:

no_of_rest=df1.location.value_counts().head(10)
no_of_rest

Out[50]:

BTM                      5071
HSR                      2496
Koramangala 5th Block    2481
JP Nagar                 2219
Whitefield               2109
Indiranagar              2033
Jayanagar                1916
Marathahalli             1808
Bannerghatta Road        1611
Bellandur                1271
Name: location, dtype: int64

In [51]:

sns.barplot(x=no_of_rest.index,y=no_of_rest,palette='colorblind')
plt.xticks(rotation=90)
plt.title("Top 10 location by number of restaurant")
plt.show()

The bar plot above indicates that BTM has the most restaurant listings in Bangalore, followed by HSR, and so on.

Recommendations and Conclusions:¶

Based on the comprehensive analysis of the Bengaluru restaurant dataset, several key takeaways and actionable recommendations for restaurant owners and stakeholders have emerged:

Location Choice: Ratings and popularity differ by location. Among the locations compared above, Brigade Road and Brookefield had the highest average vote counts, so stakeholders may want to evaluate such areas when setting up or investing in restaurants.

Online Services: Restaurants that allow online table booking tend to have higher ratings in this data. This is an association rather than proof of cause, but it suggests that restaurant owners should consider offering online ordering and online table booking to improve customer convenience and satisfaction.

Cuisine Variety: Considering the prevalence of North Indian, Chinese, and South Indian cuisines in Bengaluru, diversifying menu offerings with a blend of these popular cuisines could attract a wider customer base and increase overall patronage.

Price Sensitivity: While maintaining competitive pricing is crucial, higher-priced restaurants should ensure that the quality of their offerings justifies the costs to meet customer expectations and satisfaction.

Service Quality: Maintaining high service standards is essential, as it significantly contributes to overall customer experiences and, subsequently, restaurant ratings. Continuous staff training and regular quality checks are recommended to ensure consistent and exceptional service.

Customer Engagement: Encouraging customer feedback through various platforms and actively responding to reviews can foster positive relationships with customers, demonstrate responsiveness, and build a strong and loyal customer base.

Market Differentiation: Understanding the specific preferences of the target audience and tailoring menus, services, and ambience to meet these preferences can help restaurants stand out in the competitive Bengaluru market and attract a dedicated customer following.

Questions -¶

How many restaurants deliver online?
How many restaurants allow online table booking?
Table booking vs Rate
Best Location
Relation between Location and Rating
Restaurant Type
Relation between Rest type and Rating
Types of Services
Relation between service and Rating
Cost of Restaurant
Number of restaurants in a Location

Practice Questions¶

Is there any correlation between the length of a review and the associated rating?
Can you identify any patterns in the most-liked dishes for different cuisines?
How are the restaurant costs distributed across the dataset?
What is the distribution of ratings in the dataset?
What are the locations of the top 5 most expensive restaurants?

Exploratory Data Analysis on Bengaluru Restaurant Data in Python
