
machine learning - How to train ML model with multiple variables ...
Oct 2, 2018 · You can either build, say, 3 separate models (or 9, if you want all 9 combinations of city and store), but the more conventional way might be to one-hot encode your string-based features into numeric values and use a single model. You then supply the encoded city and store along with your units_sold and num_employees as part of the input to get your prediction.
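A minimal sketch of the one-hot encoding idea described above, assuming pandas is available. The column names follow the snippet; the rows are invented purely for illustration.

```python
import pandas as pd

# Hypothetical data: column names come from the snippet, values are made up.
df = pd.DataFrame({
    "city": ["A", "B", "C", "A"],
    "store": ["X", "Y", "Z", "Y"],
    "units_sold": [10, 20, 15, 12],
    "num_employees": [3, 5, 4, 3],
})

# Replace each string column with one 0/1 indicator column per category.
encoded = pd.get_dummies(df, columns=["city", "store"], dtype=int)

print(encoded.columns.tolist())
```

The numeric columns pass through unchanged, so the encoded frame can be fed to a single model.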
Best practices to store Python machine learning models
What are the best practices to save, store, and share machine learning models? In Python, we generally store the binary representation of the model, using pickle or joblib.
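As a rough illustration of storing the binary representation with pickle, the sketch below saves and reloads a small scikit-learn model. The model type, data, and file name are assumptions chosen only to keep the example short.

```python
import pickle
import tempfile
from pathlib import Path

from sklearn.linear_model import LogisticRegression

# Toy data and model, invented for illustration.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0, 0, 1, 1]
model = LogisticRegression().fit(X, y)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "model.pkl"  # hypothetical file name
    with open(path, "wb") as f:
        pickle.dump(model, f)       # write the binary representation
    with open(path, "rb") as f:
        restored = pickle.load(f)   # only unpickle files you trust

# The restored model should predict exactly like the original.
same = bool((restored.predict(X) == model.predict(X)).all())
print(same)
```

joblib.dump and joblib.load follow the same save-then-load pattern.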
python - How to apply machine learning model to new dataset
I'm very new to machine learning and Python in general, and I'm trying to apply a Decision Tree Classifier to the dataset I'm working on. I would like to use this model to predict the outcome after training it on certain cellular features. The training data consists of a results column, describing either a living/dead cell as 1 and 0 ...
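One possible shape for this workflow, assuming scikit-learn: fit the tree on labelled rows, then call predict on unseen rows with the same feature columns. The two features and all values below are invented stand-ins for the cellular features mentioned in the question.

```python
from sklearn.tree import DecisionTreeClassifier

# Hypothetical training data: two made-up features per cell,
# with the results column encoded as 1 (living) and 0 (dead).
X_train = [[1.0, 0.2], [0.9, 0.1], [0.2, 0.8], [0.1, 0.9]]
y_train = [1, 1, 0, 0]

clf = DecisionTreeClassifier(random_state=0)
clf.fit(X_train, y_train)

# New, unlabelled cells must have the same features in the same order.
X_new = [[0.95, 0.15], [0.15, 0.85]]
predictions = clf.predict(X_new)
print(predictions.tolist())
```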
Python vs R for machine learning - Data Science Stack Exchange
Jun 12, 2014 · I'm just starting to develop a machine learning application for academic purposes. I'm currently using R and training myself in it. However, in a lot of places, I have seen people using Python. Wh...
python - What do x and y mean when working with data in the …
Dec 14, 2021 · We need train & test data, so what do x and y mean? Does it mean that it divides 15% to the 'x_train', 'x_val', 'y_train', 'y_val' ? x_train, x_val, y_train, y_val = train_test_split(x, y, test...
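To make the roles of x and y concrete, here is a small sketch assuming scikit-learn. The call in the question is truncated, so test_size=0.15 is an assumption based on the 15% it mentions, and the data is invented.

```python
from sklearn.model_selection import train_test_split

# x holds the input features (one row per sample);
# y holds the target value for each row.
x = [[i, i * 2] for i in range(20)]
y = [i % 2 for i in range(20)]

# Assumed arguments: hold out roughly 15% of the rows for validation.
x_train, x_val, y_train, y_val = train_test_split(
    x, y, test_size=0.15, random_state=0
)

print(len(x_train), len(x_val))
```

The split is applied to x and y together, so each row of x_train still lines up with its label in y_train.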
Do I need to convert booleans to ints to enter them in a machine ...
Dec 11, 2018 · My dataset contains a lot of boolean columns. Do I really need to convert them before I can feed them into the algorithm? I'm going to use KNN for now but will test other algorithms later, so I'm ...
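Whether conversion is strictly required is not settled by this snippet, but an explicit cast is cheap and unambiguous. The sketch below, assuming NumPy and scikit-learn, casts invented boolean features to 0/1 integers before fitting KNN.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical boolean features and labels, invented for illustration.
X_bool = np.array([
    [True, False],
    [True, True],
    [False, False],
    [False, True],
])
y = [1, 1, 0, 0]

# Explicit cast: True becomes 1 and False becomes 0.
X_int = X_bool.astype(int)

knn = KNeighborsClassifier(n_neighbors=1).fit(X_int, y)
pred = knn.predict([[1, 0]])
print(pred.tolist())
```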
machine learning - strings as features in decision tree/random …
In most of the well-established machine learning systems, categorical variables are handled naturally. For example, in R you would use factors, and in WEKA you would use nominal variables. This is not the case in scikit-learn. The decision trees implemented in scikit-learn use only numerical features, and these features are always interpreted as continuous numeric variables. Thus, simply replacing ...
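Since the trees expect numeric input, one common workaround (a swapped-in illustration, not necessarily what the truncated answer goes on to recommend) is to one-hot encode string features inside a pipeline. The feature values and labels below are invented.

```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeClassifier

# Hypothetical single string feature and binary labels.
X = [["red"], ["green"], ["blue"], ["red"]]
y = [1, 0, 0, 1]

# The encoder turns each category into its own 0/1 column,
# so the tree never treats the strings as an ordered number.
model = make_pipeline(
    OneHotEncoder(handle_unknown="ignore"),
    DecisionTreeClassifier(random_state=0),
)
model.fit(X, y)

result = model.predict([["red"], ["blue"]])
print(result.tolist())
```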
Should you use random state or random seed in machine learning …
Jul 22, 2020 · I'm starting to study machine learning. In all the examples I saw, the person who created the ML model used a random state or a random seed to fix the randomness of the process. But in real life, when you're trying to apply a machine learning model to an actual company project, should you use any random state or seed?
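The sketch below only shows what fixing a random state does mechanically, assuming scikit-learn; it does not answer whether you should fix one in production. The data and the seed value 42 are arbitrary choices.

```python
from sklearn.model_selection import train_test_split

data = list(range(10))

# Two independent calls with the same random_state shuffle identically.
a_train, a_test = train_test_split(data, test_size=0.3, random_state=42)
b_train, b_test = train_test_split(data, test_size=0.3, random_state=42)

reproducible = (a_train == b_train) and (a_test == b_test)
print(reproducible)
```

Without a fixed random_state, repeated runs can produce different splits, which makes results harder to compare.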
python - Converting dates to appropriate form to train machine …
Jan 29, 2019 · I am trying to apply linear regression on a dataset where the independent variable is a date formatted like '2-jan-08'. How should I convert the date so it can be used for model fitting?
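One way to do this with the standard library is to parse the string and express it as a day count, sketched below. The format string assumes English month abbreviations, and the reference date is an arbitrary choice for illustration.

```python
from datetime import date, datetime

raw = "2-jan-08"  # example format from the question

# %d = day, %b = abbreviated month name (locale-dependent), %y = 2-digit year.
parsed = datetime.strptime(raw, "%d-%b-%y").date()

# Turn the date into a number a regression can use:
# days elapsed since a chosen reference date (assumed here).
reference = date(2008, 1, 1)
days_since_reference = (parsed - reference).days

print(parsed, days_since_reference)
```

date.toordinal() is an alternative if a fixed reference date is not wanted.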
python - Machine Learning methods for finding outliers - Data …
Jan 12, 2020 · I want to detect anomalies or outliers in this CSV file. I am relatively new to this, so I would appreciate some help or guidance on how to tackle this problem. What is the best approach to finding outliers in a DataFrame? My language of choice is Python.
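As one simple starting point (not necessarily the best approach for every dataset), the interquartile-range rule flags values far outside the middle half of a column. The sketch assumes pandas; the column name and values are invented, and the 1.5 multiplier is the conventional default.

```python
import pandas as pd

# Hypothetical column; in practice this would come from pd.read_csv(...).
df = pd.DataFrame({"value": [10, 12, 11, 13, 12, 95]})

q1 = df["value"].quantile(0.25)
q3 = df["value"].quantile(0.75)
iqr = q3 - q1

# Flag rows outside [Q1 - 1.5 * IQR, Q3 + 1.5 * IQR].
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr
outliers = df[(df["value"] < lower) | (df["value"] > upper)]

print(outliers["value"].tolist())
```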