
5 Top Secret Gems Of Python Libraries Untold To The Data Science World


Photo by krakenimages on Unsplash
One of the most incredible things about working with Python is its rich ecosystem of open-source libraries. There is a library for practically anything. If you have read some of my previous blogs, you may have noticed that I'm a big supporter of low-code libraries. That's not because I'm too lazy to type code but because I prefer investing my time in projects that add value. If a library can solve a problem, why not save your valuable time and give it a try? Today, I will present five libraries you have probably never heard of but should add to your toolkit. Let's get started!


PS: There are lots of amazing resources out there for learning ML and data science. My personal favorite is DataCamp. This is where I started my journey and, trust me, it's amazing and worth your time. So this article will include some of my favorite hand-picked courses after every topic, courses which helped me a lot.

This would also be a great time to grab a yearly subscription (which I have), which gives unlimited access to all the courses and everything else on DataCamp, and to make fruitful use of your time sitting at home during this pandemic. So go for it, folks, and happy learning; make the best use of this quarantine time and come out of this pandemic stronger and more skilled.



Emot

Emot is a library that can power up your next NLP project. It converts emojis and emoticons into descriptive text. For illustration, assume that someone wrote "I ❤️ Machine Learning" on social media. That person didn't use the word love; instead, they used an emoji. If you feed this into an NLP project as-is, you will have to strip the emoji and lose a big piece of the message. That's where Emot comes in: it translates emojis and emoticons into information. For those unfamiliar, emoticons are ways to express emotions using plain characters, for example :) for a smiley face or :( for a sad face.

Let's Code

To install it, run pip install emot, and you are good to go. Then import it with import emot. You then decide whether you want to decode emojis or emoticons.

import emot
emot.emoji("I ❤ Machine Learning")
Output: {'value': ['❤'], 'mean': [':red_heart:'], 'location': [[2, 2]], 'flag': True}

You can see above that I passed the sentence I ❤️ Machine Learning to Emot and let it figure things out. The output is a dictionary with the values, their meanings, and their locations. You can also slice it and focus on just the information you need.
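As a sketch of that slicing idea, here is a small, hypothetical helper (the dict shape is taken from the output above, with inclusive [start, end] locations; the function name is my own, not part of Emot) that replaces each emoji in a sentence with its textual meaning:

```python
def replace_emojis(text, result):
    # Hypothetical post-processing of an Emot-style result dict:
    # walk the matches right-to-left so earlier indices stay valid.
    matches = sorted(zip(result['mean'], result['location']),
                     key=lambda m: m[1][0], reverse=True)
    for mean, (start, end) in matches:
        text = text[:start] + mean + text[end + 1:]
    return text

res = {'value': ['❤'], 'mean': [':red_heart:'], 'location': [[2, 2]], 'flag': True}
print(replace_emojis("I ❤ Machine Learning", res))  # I :red_heart: Machine Learning
```

This keeps the sentiment information in plain text instead of dropping it, which is exactly what you want before feeding the sentence to an NLP pipeline.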


Dabl

Dabl aims to make machine learning modelling more accessible for beginners. For this purpose, it offers low-code solutions for machine learning tasks. Dabl simplifies data cleaning, building visualizations, constructing baseline models, and explaining models. Let's quickly examine its most interesting functionality.

Let's Code

To install it, use pip install dabl. Then we import the dabl package using import dabl and load the diabetes dataset, which you can download here. Then we create a dataframe from the data. Next, we use dabl.clean(db_data) to get information about the features, such as any useless ones. It also reports continuous, categorical, and high-cardinality features.

import pandas as pd
import dabl

db_data = pd.read_csv('diabetes.csv')
db_clean = dabl.clean(db_data, verbose=1)
Detected feature types:
continuous      7
dirty_float     0
low_card_int    1
categorical     1
date            0
free_string     0
useless         0
dtype: int64

Pregnancies	Glucose	BloodPressure	SkinThickness	Insulin	BMI	DiabetesPedigreeFunction	Age	Outcome
0	2	138	62	35	0	33.6	0.127	47	1
1	0	84	82	31	125	38.2	0.233	23	0
2	0	145	0	0	0	44.2	0.630	31	1
3	0	135	68	42	250	42.3	0.365	24	1
4	1	139	62	41	480	40.7	0.536	21	0
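To get a feel for what dabl.clean reports above, here is a rough, pure-pandas approximation of that feature-type detection (the thresholds and heuristics are my own simplification, not dabl's actual rules):

```python
import pandas as pd

def detect_feature_types(df, low_card_threshold=10):
    # Crude stand-in for dabl's detection: bucket each column
    # by dtype and number of unique values.
    types = {}
    for col in df.columns:
        n_unique = df[col].nunique()
        if pd.api.types.is_numeric_dtype(df[col]):
            if n_unique <= 2:
                types[col] = 'categorical'   # e.g. a 0/1 outcome
            elif n_unique <= low_card_threshold:
                types[col] = 'low_card_int'
            else:
                types[col] = 'continuous'
        else:
            types[col] = 'free_string'
    return types

df = pd.DataFrame({'Glucose': list(range(100, 120)),
                   'Outcome': [0, 1] * 10})
print(detect_feature_types(df))  # {'Glucose': 'continuous', 'Outcome': 'categorical'}
```

Dabl does this (and more, such as flagging dirty floats and useless columns) automatically, which is the whole point of reaching for it.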

You can use dabl.plot(data, target_col) to produce visualizations of the features with respect to a particular target column:

dabl.plot(db_clean, 'Outcome')


Sweetviz

Sweetviz is a low-code Python library that creates appealing visualizations to kick-start your exploratory data analysis with just two lines of code. The result is an interactive HTML file. You can install Sweetviz using pip install sweetviz. Let's get a quick overview of it.

Let's Code

import sweetviz as sv
import pandas as pd

df = pd.read_csv('diabetes.csv')
my_report = sv.analyze(df)
my_report.show_html()  # Default arguments will generate "SWEETVIZ_REPORT.html"

Sweetviz creates an EDA HTML file with information about the entire dataset and breaks it down so that you can investigate each feature separately. For each feature you get its statistics, its associations with the other features, and its largest, smallest, and most frequent values. The visualization also varies depending on the data type. I highly suggest you try it.
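To make the "everything in one HTML file" idea concrete, here is a tiny hand-rolled stand-in (names and layout are mine, not Sweetviz's) that builds a self-contained HTML summary for one numeric column — essentially what Sweetviz automates for every feature at once:

```python
import statistics

def column_summary_html(name, values):
    # Build a minimal, self-contained HTML fragment with basic stats,
    # loosely mimicking one card of a Sweetviz report.
    return (f"<div><h3>{name}</h3>"
            f"<p>min={min(values)}, max={max(values)}, "
            f"mean={statistics.mean(values):.2f}</p></div>")

html = column_summary_html("Glucose", [138, 84, 145, 135, 139])
print(html)  # <div><h3>Glucose</h3><p>min=84, max=145, mean=128.20</p></div>
```

Multiply that by every column, add histograms and pairwise associations, and you get an idea of how much work those two lines of Sweetviz are saving you.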


PyForest

When you start writing code for a project, what is your first step? You probably import the libraries you will need. The problem is that you never know which libraries you will need until you need one and hit an error. That's why PyForest is one of the handiest libraries I know. PyForest can import the 40 most popular data science libraries into your notebook with one line of code. Forget about trying to remember how to import every library; PyForest does it for you. In short, you install it, import it, and use it! What about the aliases? Don't worry about them: the library imports everything under the aliases we are accustomed to.

Let's Code

Use pip install pyforest to install the library, and you are good to go. To import PyForest, use from pyforest import *, and you can start using your libraries right away. To see which imports are available, use the function lazy_imports().

!pip install pyforest
from pyforest import *
lazy_imports()
['import sklearn',
'import plotly as py',
'from sklearn.ensemble import GradientBoostingClassifier',
'import lightgbm as lgb',
'import awswrangler as wr',
'import keras',
'import pickle',
'import pandas as pd',
'import statistics',
'import datetime as dt',
'import seaborn as sns',
'import nltk',
'from sklearn.ensemble import RandomForestClassifier',
'import numpy as np',
'from pyspark import SparkContext',
'import plotly.express as px',
'import matplotlib as mpl',
'from sklearn.model_selection import train_test_split',
'import dash',
'import altair as alt',
'import xgboost as xgb',
'import gensim',
'import os',
'from sklearn.ensemble import RandomForestRegressor',
'from sklearn import svm',
'import tqdm',
'from sklearn.feature_extraction.text import TfidfVectorizer',
'import glob',
'import tensorflow as tf',
'import bokeh',
'from dask import dataframe as dd',
'from sklearn.ensemble import GradientBoostingRegressor',
'from openpyxl import load_workbook',
'import re',
'import spacy',
'from pathlib import Path',
'import sys',
'import matplotlib.pyplot as plt',
'from sklearn.manifold import TSNE',
'from sklearn.preprocessing import OneHotEncoder',
'import plotly.graph_objs as go',
'import pydot']

All the libraries above are ready to use. Technically, the notebook only imports a library once you actually use it; otherwise it stays unimported. You can recognize libraries such as Pandas, XGBoost, Plotly, Keras, Matplotlib, Sklearn, NLTK, Numpy, Seaborn, TensorFlow, and others.
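The trick behind that behaviour is lazy importing. Here is a toy version of the idea (my own sketch, not PyForest's actual implementation): the name exists immediately, but the real module is only imported on first use.

```python
import importlib

class LazyModule:
    """A name that defers the real import until first attribute access."""
    def __init__(self, module_name):
        self._module_name = module_name
        self._module = None

    def __getattr__(self, attr):
        if self._module is None:
            # First use: perform the real import now.
            self._module = importlib.import_module(self._module_name)
        return getattr(self._module, attr)

json = LazyModule('json')     # nothing actually imported yet
print(json.dumps({'a': 1}))   # first use triggers the import -> {"a": 1}
```

This is why importing PyForest costs almost nothing: you pay the import price only for the libraries you actually end up calling.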


Geemap

Geemap is a Python library that enables interactive mapping, with Google Earth Engine at its core. You are probably familiar with Google Earth and all its potential, so why not use it for your next project?

Let's Code

You can install it with pip install geemap. To import it, use import geemap. For demonstration purposes, I will create a folium-based interactive map using the following code:

import geemap.eefolium as geemap

Map = geemap.Map(center=[40, -100], zoom=4)
Map  # display the interactive map in the notebook

The output is not a static plot but a fully interactive map. They have a comprehensive GitHub README discussing in more detail how it works and what it enables.


Machine Learning Scientist with R
Machine Learning Scientist with Python
The above two are my favourite career tracks for deep-diving into and understanding machine learning. I am currently enrolled in the first one, and it is just excellent and so helpful.
Machine Learning Toolbox in R
Pre-processing for ML in Python
Intro to ML in R
Machine Learning Fundamentals in R is one of the best ML tracks, with impressive courses.
Machine Learning Fundamentals in Python is another track with lots of fantastic classes inside it.
Preparing for ML Interviews in R and Python
So you can either opt for individual courses or for tracks with a set of courses inside them. I prefer tracks because they bundle all the relevant courses together. So I encourage readers to go ahead and check them out.


PyForest, Emot, Geemap, Dabl, and Sweetviz are libraries that deserve to be appreciated because they turn complex tasks into straightforward ones. If you use them, you will save your precious time for the parts that matter.

I urge you to try them out and explore the functionality I didn't discuss here. If you do, let me know what you find out about them.

Any helpful feedback is appreciated. Also, you can connect with me on my social media profiles: Github, LinkedIn, and Medium.

Disclaimer: All the images used in this write-up are created by the author, and he owns the rights to all images.

