Introduction to Streamlit: Building an Employee Attrition Dashboard
Corgi working on a Data Science project. 2023. Roman x Flux Dev
Streamlit has rapidly become a go-to tool for data scientists and developers wanting to turn data scripts into shareable web apps. Let’s explore its core features and benefits:
Key Features of Streamlit
- Simplicity: With just a few lines of Python code, you can have a running web application. No need to deal with HTML, CSS, or JavaScript unless you want to.
- Interactive Widgets: Streamlit offers out-of-the-box widgets like sliders, buttons, and text inputs that make your app interactive.
- Data Integration: It seamlessly integrates with popular data science libraries like Pandas, Numpy, Matplotlib, and others.
- Data Caching: With
@st.cache
, Streamlit caches the output of functions, ensuring your data operations are efficient and your apps remain performant. - Hot Reloading: As you save changes to your script, the app refreshes in real-time. No need to manually restart your app.
While Streamlit offers a robust platform for creating web apps, Gradio provides several distinct advantages, especially when the focus is on deploying machine learning models:
- Ease of Model Deployment: Gradio prioritizes simplifying the deployment of machine learning models. Its intuitive Python API allows for quick interface generation, ensuring accessibility even for non-experts.
- Diverse Input Types: Gradio supports a wide variety of input formats, from images and text to audio, providing flexibility especially when dealing with different data types.
- Multi-Model Deployment: Gradio stands out with its capability to simultaneously deploy multiple models, perfect for ensemble methods or side-by-side model comparisons.
- Shareability: Gradio’s feature of generating shareable URLs makes collaboration and showcasing a breeze.
- Security Features: Gradio offers built-in adversarial robustness, adding an extra layer of protection against potential adversarial attacks on deployed models.
These unique features make Gradio an attractive option for projects that focus on deploying and sharing machine learning models with diverse requirements. We are going to explore gradio after introductions to ML.
Why Choose Streamlit?
- Rapid Prototyping: Streamlit’s intuitive API and hot reloading mean you can quickly iterate and refine your app.
- Open Source: Being open-source, it boasts a strong community that contributes to its growth and offers a plethora of community plugins.
- Deployment Ready: With platforms like Streamlit Sharing, deploying your app to the world is just a click away.
- Extensible: You can integrate advanced JavaScript features or even other Python libraries to extend Streamlit’s capabilities.
Core Components of Streamlit
- Layout:
st.columns
and st.container
can be used to design your app’s layout. - Widgets: These are interactive elements like
st.slider()
, st.selectbox()
, and st.button()
that capture user input.
Display Elements
- Media: Display images, videos, or audio clips using
st.image()
, st.video()
, and st.audio()
. - Charts: Use
st.line_chart()
, st.bar_chart()
, or integrate with libraries like Altair for custom visualizations. - Tables: Showcase data with
st.table()
or st.dataframe()
.
Session State
- State: Store user data or app state across reruns with
st.session_state
.
Dive Deeper
To truly master Streamlit, it’s recommended to experiment with building various apps and exploring its official documentation. The community is active, and there’s always something new to learn!
What You’ll Build
By the end of this tutorial, you’ll have a functional dashboard that allows users to:
- Filter data based on various employee metrics.
- Visualize attrition rates and patterns across different dimensions.
- Understand insights from visualizations.
- Get actionable recommendations based on data insights.
Prerequisites
Before you begin, ensure you have the following:
- Basic knowledge of Python.
- Streamlit installed (
pip install streamlit
). - Familiarity with data visualization libraries like
altair
, matplotlib
, and seaborn
.
Getting Started
1. Data Source
We’ll use a synthetic dataset on employee attrition from this link.
2. File Structure
Ensure your working directory has the following structure:
📦Your_Directory
┣ 📜app.py
┗ 📜requirements.txt
┗ 📜data.csv
app.py
will contain our Streamlit app’s code, while requirements.txt
will list the necessary Python packages.
3. Hosting the App
Once you’ve built the app, we’ll first host it on Streamlit Cloud and then on uCloud (A Danish private cloud for universities).
Building the Dashboard
Here is the code
Initialization
We start by importing necessary libraries and loading our dataset. The data is cached using @st.cache_data
to enhance performance.
Designing the Interface
Streamlit provides intuitive functions to design the user interface:
st.title()
and st.header()
set titles and headers.st.markdown()
allows for rich text formatting.st.sidebar
lets you add interactive widgets in a sidebar for filtering.
Visualizations
Depending on user input, we visualize the filtered data using various charts. For instance, altair
is used for bar and pie charts, while matplotlib
and seaborn
provide KDE plots and boxplots.
Insights and Recommendations
Finally, expanders provide a space to share insights derived from visualizations and actionable recommendations.
Hosting
In 2024, the best free option is Hugging Face.
- Create an empty Streamlit Space.
- Log into Hugging Face from your development platform (local, uCloud, Codespaces) using the Command Line Interface (CLI) with a token that you obtain from Hugging Face. Then add your email and name for Git to connect.
- Clone your empty repository.
- Add your
app.py
and requirements.txt
files into the repository folder. - Finally, push and deploy your app.
Happy Coding! 🚀
Old corgy from last years…
Corgi working on a Data Science project. 2023. Roman x Stable Diffusion XL
Corgi working on a Data Science project. 2022. Roman x Stable Diffusion