Accelerate Your Data Science Skills with These Ultimate ChatGPT Prompts

With its ability to understand natural language and generate text that sounds like it was written by a human, ChatGPT has a wide range of potential applications in the field of data science.

In this article, we will explore how data scientists can use ChatGPT prompts to streamline their workflow and get more done in less time.

We will focus on the areas where ChatGPT is most useful: writing and optimizing code, explaining code and concepts, and generating new ideas for analysis.

Prompts for Writing Python

  1. I want you to act as a Python code generator and create a function that will do [task].
  2. I want you to act as a Python script writer and write a program that will scrape [data source] data from a website.
  3. I want you to act as a Python developer and write a module that will calculate [metric] using [dataset].

Prompts for Anomaly Detection

  1. I want you to act as a data scientist and detect [anomalies] in the [network traffic] of [organization] using [machine learning] algorithms.
  2. I want you to act as a security analyst and identify [intrusions] in the [system logs] of [server] using [anomaly detection] techniques.
  3. I want you to act as a fraud analyst and detect [fraudulent transactions] in the [financial data] of [company] using [statistical analysis] methods.

Prompts for Automatic Machine Learning

  1. I want you to act as an automatic machine learning (AutoML) bot using TPOT for me. I am working on a model that predicts […]. Please write Python code to find the best classification model with the highest AUC score on the test set.
  2. I want you to act as an AutoML system and generate Python code to build a machine learning pipeline that optimizes [metric] on [dataset].
  3. I want you to act as an ML engineer and create an AutoML script that tunes [hyperparameters] to achieve the best performance on [dataset].
  4. I want you to act as a data scientist and use Auto-sklearn to automatically build a classification model that predicts [target variable] based on [features] features.

Prompts to Train a Classification Model

  1. I want you to act as a data scientist and code for me. I have a dataset of [describe dataset]. Please build a machine learning model that predicts [target variable].
  2. I want you to act as a data scientist and train a classification model to predict [target variable] based on [features] dataset.
  3. I want you to act as a machine learning engineer and build a classification model that can classify [label] based on [features] features.
  4. I want you to act as a deep learning specialist and train a convolutional neural network to classify [object] using [image format] images.

Prompts to Compare Function Speed

  1. I want you to act as a software developer. I would like to compare the efficiency of two algorithms that perform the same task in Python. Please write code that helps me run an experiment that can be repeated 5 times. Please output the runtime and other summary statistics of the experiment. [Insert functions]
  2. I want you to act as a performance tester and compare the speed of [function1] and [function2] when processing [input data] in [Python script].
  3. I want you to act as a data scientist and compare the speed of different [machine learning algorithms] on [dataset] using the [timeit] module.
  4. I want you to act as a speed optimizer and compare the speed of different [Python libraries] for [task] in [code snippet].
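
For reference, the kind of answer the first prompt asks for might look like the sketch below. It uses only the standard `timeit` and `statistics` modules; the two functions being compared (`sum_with_loop` and `sum_with_builtin`) and the helper `compare_speed` are hypothetical stand-ins, not something ChatGPT is guaranteed to produce.

```python
import statistics
import timeit


def sum_with_loop(n):
    """Hypothetical function 1: add up 0..n-1 with an explicit loop."""
    total = 0
    for i in range(n):
        total += i
    return total


def sum_with_builtin(n):
    """Hypothetical function 2: the same result via the built-in sum()."""
    return sum(range(n))


def compare_speed(funcs, arg, repeats=5, number=200):
    """Time each function `repeats` times and return summary statistics in seconds."""
    results = {}
    for func in funcs:
        # Each entry in `runs` is the total time taken by `number` calls.
        runs = timeit.repeat(lambda: func(arg), repeat=repeats, number=number)
        results[func.__name__] = {
            "min": min(runs),
            "mean": statistics.mean(runs),
            "stdev": statistics.stdev(runs),
        }
    return results


if __name__ == "__main__":
    report = compare_speed([sum_with_loop, sum_with_builtin], 10_000)
    for name, stats in report.items():
        print(f"{name}: min={stats['min']:.4f}s "
              f"mean={stats['mean']:.4f}s stdev={stats['stdev']:.4f}s")
```

The minimum is usually the most stable figure to compare, since the slower runs tend to reflect other activity on the machine rather than the code itself.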

Prompts for Creating NumPy Array

  1. I want you to act as a data scientist. I need to create a numpy array. This numpy array should have the shape of (x,y,z). Please initialize the numpy array with random values.
  2. I want you to act as a data scientist and create a 1D NumPy array of [length] that contains [values].
  3. I want you to act as a Python developer and create a 2D NumPy array of shape [row, column] that represents the [matrix] in [dataset].
  4. I want you to act as a machine learning expert and create a random 3D NumPy array of shape [batch_size, height, width] that simulates [image data].

Prompts for Clustering

  1. I want you to act as a data scientist and cluster the [customers] in [dataset] into [n] groups based on their [purchase history].
  2. I want you to act as a machine learning expert and develop a [clustering model] that groups the [documents] in [dataset] based on their [content].
  3. I want you to act as a data analyst and visualize the [clusters] in [dataset] using [dimensionality reduction] techniques.

Prompts for Dimensionality Reduction

  1. I want you to act as a data scientist and reduce the [dimensionality] of the [image data] in [dataset] using [principal component analysis] technique.
  2. I want you to act as a data scientist and provide a step-by-step guide on how to perform [t-SNE] for my dataset.
  3. I want you to act as a data scientist and explain the difference between [PCA] and [LDA] and how they can be used for [dimensionality reduction] in my dataset.

Prompts to Tune Hyperparameters

  1. I want you to act as a data scientist and code for me. I have trained a [model name]. Please write the code to tune the hyperparameters.
  2. I want you to act as a hyperparameter tuner and optimize the [hyperparameter] of a [algorithm] algorithm to achieve the highest [metric] on [dataset].
  3. I want you to act as a machine learning expert and use Optuna to perform a Bayesian optimization of [hyperparameters] for a [model] on [dataset].
  4. I want you to act as a data scientist and perform a random search of [hyperparameters] for a [algorithm] algorithm to achieve the best [metric] on [dataset].

Prompts for Data Preprocessing

  1. I want you to act as a data analyst and preprocess the [raw data] in [dataset] by removing [duplicate records] and [missing values].
  2. I want you to act as a data engineer and preprocess the [time-series data] in [dataset] by resampling it to a [lower or higher frequency].
  3. I want you to act as a data scientist and preprocess the [text data] in [dataset] by [tokenizing] it and removing [stop words] and [punctuation marks].

Prompts to Explore Data

  1. I want you to act as a data scientist and code for me. I have a dataset of [describe dataset]. Please write code for data visualisation and exploration.
  2. I want you to act as a data analyst and generate a visualization that shows the distribution of [feature] in [dataset].
  3. I want you to act as a data scientist and generate summary statistics of [feature] in [dataset].
  4. I want you to act as a data explorer and clean [dataset] by removing missing values, duplicates, and outliers.

Prompts to Generate Data

  1. I want you to act as a fake data generator. I need a dataset that has x rows and y columns: [insert column names]
  2. I want you to act as a data generator and create a synthetic dataset with [number of features] features and [number of instances] instances.
  3. I want you to act as a data scientist and generate a time series dataset with [seasonality] seasonality and [trend] trend.
  4. I want you to act as a data simulation expert and generate a dataset that simulates [process] with [parameters] parameters.
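
As an illustration of the third prompt, here is a minimal sketch of a synthetic time series built from a linear trend, a sine-wave seasonality, and Gaussian noise. The function name `generate_series` and all of its default parameters are assumptions chosen for the example.

```python
import math
import random


def generate_series(n_points, trend=0.5, amplitude=10.0, period=12, noise=1.0, seed=0):
    """Return n_points values: linear trend + sine seasonality + Gaussian noise."""
    rng = random.Random(seed)  # fixed seed so the data is reproducible
    series = []
    for t in range(n_points):
        trend_part = trend * t
        seasonal_part = amplitude * math.sin(2 * math.pi * t / period)
        noise_part = rng.gauss(0, noise)
        series.append(trend_part + seasonal_part + noise_part)
    return series


if __name__ == "__main__":
    # Three "years" of monthly data with a yearly cycle.
    monthly = generate_series(36)
    print([round(value, 2) for value in monthly[:6]])
```

Setting `noise=0.0` and `amplitude=0.0` leaves only the trend, which is a quick way to check that each component behaves as expected.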

Prompts to Address Imbalanced Data

  1. I want you to act as a coder. I have trained a machine learning model on an imbalanced dataset. The target variable is the column [Insert column name]. In Python, how do I oversample and/or undersample my data?
  2. I want you to act as a data scientist and use SMOTE to oversample the minority class of [imbalanced dataset] for a classification task.
  3. I want you to act as a machine learning expert and use stratified sampling to balance the distribution of [target variable] in [dataset].
  4. I want you to act as a data engineer and apply random undersampling to address the class imbalance in [imbalanced dataset] for training a model.
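
To make the fourth prompt concrete, random undersampling can be sketched with the standard library alone. The helper `random_undersample`, the column name `is_fraud`, and the toy rows below are all hypothetical; in practice ChatGPT will often suggest a dedicated library instead.

```python
import random
from collections import Counter, defaultdict


def random_undersample(rows, label_key, seed=0):
    """Randomly drop rows from larger classes until every class matches the smallest one."""
    rng = random.Random(seed)

    # Group the rows by their class label.
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[label_key]].append(row)

    # Keep as many rows per class as the rarest class has.
    smallest = min(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, smallest))

    rng.shuffle(balanced)  # avoid leaving the rows grouped by class
    return balanced


if __name__ == "__main__":
    # Toy data: 10 positive rows out of 100.
    rows = [{"amount": i, "is_fraud": int(i % 10 == 0)} for i in range(100)]
    balanced = random_undersample(rows, "is_fraud")
    print(Counter(row["is_fraud"] for row in balanced))
```

Undersampling should normally be applied to the training split only, so that the validation data keeps the real class distribution.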

Prompts for Natural Language Processing (NLP)

  1. I want you to act as a machine learning expert and build a [text classification model] that classifies [customer feedback] in [dataset] as positive or negative.
  2. I want you to act as a data scientist and analyze the [sentiment] of the [reviews] in [dataset] using [natural language processing] techniques.
  3. I want you to act as a language model researcher and develop a [language model] that can generate [text data] similar to the [training data].

Prompts for Recommender Systems

  1. I want you to act as a data scientist and develop a [content-based recommender system] that suggests [articles] based on [user interests].
  2. I want you to act as a machine learning expert and build a [collaborative filtering model] that recommends [products] to [customers] based on their [purchase history].
  3. I want you to act as a data analyst and evaluate the [accuracy] of the [recommendations] generated by the [recommender system] in [dataset].

Prompts to Train Time Series Models

  1. I want you to act as a data scientist and code for me. I have a time series dataset [describe dataset]. Please build a machine learning model that predicts [target variable]. Please use [time range] as the training set and [time range] as the validation set.
  2. I want you to act as a time series expert and build a recurrent neural network that predicts [target variable] based on [time series data].
  3. I want you to act as a data scientist and train a seasonal ARIMA model to forecast [variable] in [time series data] using [forecast horizon] forecast periods.
  4. I want you to act as a machine learning engineer and train a long short-term memory network that detects [event] in [sensor data].

Prompts for Time Series Forecasting

  1. I want you to act as a data scientist and forecast the [sales] of [product] for the next [n months] using [time series forecasting] techniques.
  2. I want you to act as a machine learning expert and develop a [neural network model] that predicts the [stock prices] of [company] based on [historical data].
  3. I want you to act as a time series analyst and analyze the [trends and patterns] in the [weather data] of [city] using [time series decomposition] techniques.

Prompts to Visualize Data

  1. I want you to act as a Python coder. I have a dataset [name] with columns [name]. [Describe graph requirements]
  2. I want you to act as a data visualization expert and create a [type of plot] that shows the relationship between [variable1] and [variable2] in [dataset].
  3. I want you to act as a data scientist and create a [type of plot] that displays the distribution of [variable] in [dataset] and compare it across different [categorical variable].
  4. I want you to act as a data analyst and create a [type of plot] that shows the trend of [variable] over time in [dataset].
  5. I want you to act as a coder. I have a folder of images. [Describe how files are organised in directory] [Describe how you want images to be printed]

Prompts to Explain a Model with LIME & SHAP

  1. I want you to act as a data scientist and explain the model’s results. I have trained a [library name] model and I would like to explain the output using LIME. Please write the code.
  2. I want you to act as a machine learning specialist and use LIME to explain how a [model] made a prediction for a specific instance in [dataset].
  3. I want you to act as a data scientist and use LIME to identify the important features that contributed to the prediction of [target variable] for [model] on [dataset].
  4. I want you to act as a model explainer and use LIME to explain how a [model] handles the interaction between [features] in [dataset].
  5. I want you to act as a data scientist and explain the model’s results. I have trained an XGBoost model using its scikit-learn API and I would like to explain the output using a series of plots with SHAP. Please write the code.

Prompts to Get Feature Importance

  1. I want you to act as a data scientist and explain the model’s results. I have trained a decision tree model and I would like to find the most important features. Please write the code.
  2. I want you to act as a data scientist and use [feature selection algorithm] to calculate the feature importance of [dataset] for [target variable].
  3. I want you to act as a machine learning expert and train a [model] on [dataset] to identify the top [number] most important features for [target variable].
  4. I want you to act as a data analyst and use the permutation feature importance technique to assess the importance of [features] for predicting [target variable] in [dataset].

Prompts to Validate Column

  1. I want you to act as a data scientist. Please write code to test that my pandas DataFrame [insert requirements here]
  2. I want you to act as a data analyst and validate the [column] in [dataset] to ensure that it contains only [valid data type].
  3. I want you to act as a data quality analyst and validate the [column] in [dataset] to ensure that it contains only [acceptable range of values].
  4. I want you to act as a data scientist and validate the [column] in [dataset] to ensure that it is not affected by [missing values] and [outliers].

Prompts to Write Multithreaded Functions

  1. I want you to act as a coder. Can you help me parallelize this code across threads in Python?
  2. I want you to act as a Python developer and write a multithreaded function that can perform [task] on [input] using [number of threads] threads.
  3. I want you to act as a performance optimizer and write a multithreaded function that can parallelize the [bottleneck task] in [code section] of [Python script].
  4. I want you to act as a concurrency expert and write a multithreaded function that can asynchronously process [list of tasks] with the help of a thread pool.
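
The thread-pool approach mentioned in the last prompt might look like the sketch below, which uses `concurrent.futures.ThreadPoolExecutor` from the standard library. The task `process_item` is a hypothetical placeholder that simulates waiting on I/O with a short sleep.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def process_item(item):
    """Hypothetical I/O-bound task, simulated here with a short sleep."""
    time.sleep(0.05)
    return item * item


def run_in_threads(items, max_workers=4):
    """Apply process_item to every item using a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the inputs.
        return list(pool.map(process_item, items))


if __name__ == "__main__":
    start = time.perf_counter()
    results = run_in_threads(range(8))
    elapsed = time.perf_counter() - start
    print(results, f"finished in {elapsed:.2f}s")
```

Threads mainly help when the work spends its time waiting (network calls, disk reads). For CPU-heavy work in standard CPython, a process pool is usually the better thing to ask for.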

Prompts to Write Regex

  1. I want you to act as a coder. Please write me a regex in Python that [describe regex]
  2. I want you to act as a regex writer and write a regular expression that matches [pattern] in [text].
  3. I want you to act as a data engineer and use regex to extract [data] from [log file].
  4. I want you to act as a web scraper and write a regex that matches [pattern] in [HTML source].
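
For the log-extraction prompt, a typical answer could resemble this sketch. The log layout (timestamp, level, message) is an assumed format invented for the example, as are the names `LOG_PATTERN` and `parse_log_line`.

```python
import re

# Assumed line format: "YYYY-MM-DD HH:MM:SS LEVEL message"
LOG_PATTERN = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) "
    r"(?P<message>.*)$"
)


def parse_log_line(line):
    """Return a dict of the named groups, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(parse_log_line("2023-03-01 12:00:05 ERROR Disk full"))
    print(parse_log_line("not a log line"))
```

Named groups keep the pattern readable and make it easy to tell ChatGPT which part to change when the format differs.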

Prompts to Write Unit Test

  1. I want you to act as a software developer. Please write unit tests for the function [Insert function]. The test cases are: [Insert test cases]
  2. I want you to act as a Python developer and write a unit test for the [function] in [Python script] to verify that it returns the expected output when provided with [input].
  3. I want you to act as a software engineer and write a unit test to ensure that the [web service] handles [error condition] correctly.
  4. I want you to act as a test automation engineer and write a unit test to verify that the [GUI component] updates the [UI element] correctly when the [user action] is performed.
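
A response to the first prompt would typically follow the pattern below, shown here with the standard `unittest` module. The function under test, `safe_divide`, and its three test cases are hypothetical examples.

```python
import unittest


def safe_divide(a, b):
    """Hypothetical function under test: return a / b, or None when b is zero."""
    if b == 0:
        return None
    return a / b


class SafeDivideTests(unittest.TestCase):
    def test_regular_division(self):
        self.assertEqual(safe_divide(10, 2), 5)

    def test_division_by_zero_returns_none(self):
        self.assertIsNone(safe_divide(1, 0))

    def test_negative_numbers(self):
        self.assertEqual(safe_divide(-9, 3), -3)
```

Saved as a file such as `test_safe_divide.py` (an assumed name), the tests can be run with `python -m unittest test_safe_divide`.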

Prompts for Writing Code

  1. I want you to act as a data scientist using R. Can you write an R script that [Insert requirement here]
  2. I want you to act as a data scientist and write SQL code for me. I have a table with two columns [Insert column names]. I would like to calculate a running average for [which value]. What is the SQL code that works for PostgreSQL 14?
  3. I want you to act as a Linux terminal expert. Please write the code to [describe requirements]
  4. Assume you are given the tables… with the columns… Output the following… [Question from DataLemur]
  5. I want you to act as a bot that generates Google Sheets formula. Please generate a formula that [describe requirements]
  6. I want you to act as an Excel VBA developer. Can you write a VBA that [Insert function here]?

Prompts for Explaining Code

  1. I want you to act as a code explainer. What is this code doing? [paste your code]
  2. I want you to act as a Google Sheets formula explainer. Explain the following Google Sheets command. [Insert formula]
  3. I want you to act as a data science instructor. Can you please explain to me what this SQL code is doing? [Insert SQL code]

Prompts for Optimizing Code

  1. I want you to act as a code optimizer. The code is poorly written. How do I correct it? [Insert code here]
  2. I want you to act as a software developer. Please help me improve the time complexity of the code below. [Insert code]
  3. I want you to act as a code optimizer. Can you point out what’s wrong with the following Pandas code and optimize it? [Insert code here]
  4. I want you to act as a code simplifier. Can you simplify the following code? [Insert code here]
  5. I want you to act as an SQL code optimizer. The following code is slow. Can you help me speed it up? [Insert SQL]

Prompts for Translating Code

  1. I want you to act as a code translator. Can you please convert the following code from [python] to [R]? [Insert code]
  2. I want you to act as a coder and write SQL code for MySQL. What is the equivalent of PostgreSQL’s DATE_TRUNC for MySQL?

Prompt to Write Documentation

  1. I want you to act as a software developer. Please provide documentation for the function below. [Insert function]

Prompt to Improve Readability

  1. I want you to act as a code analyzer. Can you improve the following code for readability and maintainability? [Insert code]

Prompts to Format SQL

  1. I want you to act as a SQL formatter. Please format the following SQL code. Please convert all reserved keywords to uppercase [Insert requirements]. [Insert Code]

Prompts to Explain Concepts

  1. I want you to act as a data science instructor. Explain [concept] to a five-year-old.
  2. I want you to act as a data science instructor. Explain [concept] to an undergraduate.
  3. I want you to act as a data science instructor. Explain [concept] to a professor.
  4. I want you to act as a data science instructor. Explain [concept] to a business stakeholder.
  5. I want you to act as an answerer on StackOverflow. You can provide code snippets, sample tables and outputs to support your answer. [Insert technical question]

 

Prompts for Suggesting Ideas

Suggest A/B Testing Steps

  1. I want you to act as a statistician. [Describe context] Please design an A/B test for this purpose. Please include the concrete steps on which statistical test I should run.
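
When the metric is a conversion rate, one test ChatGPT may recommend is the two-proportion z-test. The sketch below is an illustration under that assumption, not the only valid choice; the function name `two_proportion_z_test` and the sample counts are made up for the example.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference between two conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b

    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))

    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


if __name__ == "__main__":
    # Made-up counts: 100/1000 conversions for A, 150/1000 for B.
    z, p = two_proportion_z_test(100, 1000, 150, 1000)
    print(f"z = {z:.2f}, p = {p:.4f}")
```

The z-test relies on a normal approximation, so it is best suited to reasonably large samples; describing your sample size in the prompt helps ChatGPT pick an appropriate test.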

Suggest Dataset

  1. I want you to act as a data science career coach. I want to build a predictive model for […]. At the same time, I would like to showcase my knowledge in […]. Can you please suggest the five most relevant datasets for my use case?

Suggest Edge Cases

  1. I want you to act as a software developer. Please help me catch edge cases for this function [insert function]

Suggest Feature Engineering

  1. I want you to act as a data scientist and perform feature engineering. I am working on a model that predicts [insert feature name]. There are columns: [Describe columns]. Can you suggest features that we can engineer for this machine learning problem?

Suggest Portfolio Ideas

  1. I want you to act as a data science coach. My background is in […] and I would like to [career goal]. I need to build a portfolio of data science projects that will help me land a role in […] as a […]. Can you suggest five specific portfolio projects that will showcase my expertise in […] and are of relevance to [company]?

Suggest Resources

  1. I want you to act as a data science coach. I would like to learn about [topic]. Please suggest 3 best specific resources. You can include [specify resource type]

Suggest Time Complexity

  1. I want you to act as a software developer. Please compare the time complexity of the two algorithms below. [Insert two functions]

Career Coaching

  1. I want you to act as a career advisor. I am looking for a role as a [role name]. My background is […]. How do I land the role and with what resources exactly in 6 months?

 

Prompts for Troubleshooting Problem

Correct Python Code

  1. I want you to act as a software developer. This code is supposed to [expected function]. Please help me debug this Python code that cannot be run. [Insert function]

Correct Own ChatGPT Code

  1. Your above code is wrong. [Point out what is wrong]. Can you try again?

Correct SQL Code

  1. I want you to act as a SQL code corrector. This code does not run in [your DBMS, e.g. PostgreSQL]. Can you correct it for me? [SQL code here]

Troubleshoot Power BI Model

  1. I want you to act as a Power BI modeler. Here are the details of my current project. [Insert details]. Do you see any problems with the table?

If you’re a data scientist looking to stay ahead of the curve, it’s definitely worth exploring the many ways in which ChatGPT can help you achieve your goals and advance your career.

Some prompts are from GitHub.

