The Student Room Group

Quick MATLAB question

Hi,
I've imported a dataset into MATLAB, and 3 of the columns contain non-numeric data.
I've converted the non-numeric data to categorical using the categorical() function.

I now need to convert the categorical data (in the dataset) into numeric data. Any ideas on what commands to use here?

I've tried the grp2idx function, but it only converts the variable I pass to it and returns the result separately from the rest of the dataset. I want that variable converted to numeric and shown as part of the whole dataset.



Thanks :smile:
Reply 1
I'm not 100% sure what you mean, or what type of object your dataset is in MATLAB, but if it's only 3 columns and grp2idx works on one column at a time, couldn't you do something like
data(:,5) = grp2idx(data(:,5))
for the 5th column, and repeat for the other two columns? Obviously a bit of wrangling may be required depending on the type of data in MATLAB.
Reply 2
Hi, thanks for your answer.

So I have no problem doing that, but then it provides me with the data for the 3 columns only and individually.

I want the numeric data for those 3 columns together with the rest of my dataset, if that makes sense?

Like, I want the numeric data and all the rest of my dataset to stay combined in 1 dataset.
Reply 3
By dataset I take it you mean a numeric matrix? If so, the index column is assigned back to the relevant column of your matrix as in the previous post. It would help to say what the original type of your dataset is, etc. Maybe include the first few rows?
Reply 4
So the 'dataset' is an Excel sheet with loads of data about car specifications. For example, car make and model, transmission, engine size, miles per gallon, city miles per gallon, fuel consumption etc.

So there's string/categorical data in there like Transmission (5-speed manual or 6-speed auto etc.) and Fuel Type (diesel or petrol). I've converted these into categorical but I'm struggling to convert them into numerical.

I'd like them converted and to be presented with the rest of the dataset together showing all of the car stats.

I can provide some data in the next post


This is what my coursework question is
4. Copy the dataset table into a new table, name it dataset_normalised, and apply the following: convert all non-numeric attributes to categorical data attributes and, subsequently, onto numerical data. Afterwards, using the z-score, normalise the following attributes: “Transmission, FuelType1, CityMPG, HighwayMPG, UnadjustedCityMPG, UnadjustedHighwayMPG, CombinedMPG, AnnualFuelCost”.
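
For reference, the z-score of a value x in a column is (x - mean) / std, using that column's own mean and standard deviation. Below is a minimal sketch on a made-up table, assuming the columns to normalise are already numeric. The values are invented, Model is a hypothetical extra column, and only two of the attribute names from the question are used.

```matlab
% Toy stand-in for dataset_normalised; all values are invented.
dataset_normalised = table([1;2;1;2], [20;30;40;50], {'A';'B';'C';'D'}, ...
    'VariableNames', {'Transmission','CityMPG','Model'});

varsToScale = {'Transmission','CityMPG'};   % numeric columns to normalise
for i = 1:numel(varsToScale)
    x = dataset_normalised.(varsToScale{i});
    % z-score: subtract the column mean, divide by the sample standard deviation
    dataset_normalised.(varsToScale{i}) = (x - mean(x)) ./ std(x);
end

disp(dataset_normalised)   % Model is left as it was; the scaled columns stay in the same table
```

The zscore function (Statistics and Machine Learning Toolbox) and normalize (base MATLAB, R2018a or later) compute the same quantity with their default settings, so the loop is mainly there to show what the calculation does.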
Reply 5
It would really help to see what MATLAB commands you've run, which objects exist in the workspace, and what their types are.
Reply 6
Hi so


dataset.Transmission = categorical(dataset.Transmission);
dataset.FuelType1 = categorical(dataset.FuelType1);

transm = grp2idx(dataset.Transmission);
fuel = grp2idx(dataset.FuelType1);


That's my code. It works, but it produces 2 things in the workspace: a separate numeric vector each for FuelType1 and Transmission.

[Attached screenshots: Screenshot (713).png, Screenshot (714).png, Screenshot (715).png]

That's what I'm getting
I want a command that will put the individual columns of data into the whole dataset
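
To illustrate why two separate vectors show up in the workspace: grp2idx returns a new numeric vector and leaves the table it read from unchanged. A small sketch with an invented table (the name cars and its values are assumptions, not the poster's data), assuming the Statistics and Machine Learning Toolbox is installed:

```matlab
% Hypothetical toy table; values are illustrative only.
cars = table(categorical({'Petrol';'Diesel';'Petrol'}), [30;25;41], ...
    'VariableNames', {'FuelType1','CityMPG'});

% The second output lists the group names in index order.
[fuel, fuelNames] = grp2idx(cars.FuelType1);

class(cars.FuelType1)   % still 'categorical': the table itself is untouched
class(fuel)             % 'double': a standalone vector in the workspace
```

The result only becomes part of the table once it is assigned to a table variable.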
Reply 7
By dataset I'm guessing you mean a matrix? As each vector should have the same number of rows, something like
data = [transm fuel]
should work?
https://www.mathworks.com/help/matlab/math/creating-and-concatenating-matrices.html
You can overwrite columns etc using the slice ":" as per #2 etc.
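
A quick sketch of what this concatenation produces and, on the assumption that the rest of the data sits in a table rather than a matrix, how the same two vectors could be attached to it. The table T and all values are invented for illustration.

```matlab
transm = [2;1;2];                 % stand-ins for the grp2idx outputs
fuel   = [2;1;2];

data = [transm fuel];             % numeric matrix holding only these two columns
size(data)                        % 3 rows, 2 columns

% If the remaining data is a table, wrap the vectors in a table and concatenate.
T = table([30;25;41], 'VariableNames', {'CityMPG'});
T = [T table(transm, fuel)];      % new variables take the names transm and fuel
```

Horizontal concatenation needs the same number of rows on both sides, and for tables the variable names must not clash.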
Reply 8
Hi, so that combines 'transm' and 'fuel' into 1 matrix together, but I want them to be a part of the whole matrix, if that makes sense?
See image below. I'm not sure if it's possible.

[Attached screenshot: Screenshot (720).png]


I'm very new to coding and MATLAB, sorry, I have very limited knowledge.
All I need to do is get the numeric and original data into the same table so I can normalise them as indicated in my work instructions: 'Afterwards, using the z-score, normalise the following attributes: “Transmission, FuelType1, CityMPG ...”'


Apologies for being such a dummy
Reply 9
What is the datatype of the object dataset? So when you type
whos
in the command window, what does it say?
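
Besides whos, a couple of base MATLAB calls report the type directly. A minimal sketch; the variable dataset here is a made-up stand-in, not the poster's data.

```matlab
% Hypothetical stand-in for the imported data.
dataset = table([30;25], {'Petrol';'Diesel'}, 'VariableNames', {'CityMPG','FuelType1'});

whos dataset                 % lists name, size, bytes and class for this one variable
class(dataset)               % 'table'
istable(dataset)             % true for a table
class(dataset.FuelType1)     % 'cell' here, before any call to categorical
```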
Reply 10
[Attached screenshot: Screenshot (722).png]

I have done a bunch of other preprocessing steps, so there may be a few more variables on there than you might expect
Reply 11
I should have spotted it's a table variable at the top of the workspace browser. So if you did
dataset.transm = grp2idx(dataset.Transmission);
dataset.fuel = grp2idx(dataset.FuelType1);
does that add two columns onto your table?
https://www.mathworks.com/help/matlab/tables.html
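
A self-contained sketch of this suggestion on an invented table, assuming grp2idx (Statistics and Machine Learning Toolbox) is available. It also shows an overwrite variant that keeps the original variable names; that variant is an assumption about what the coursework wants, and the thread itself only covers the add-columns variant.

```matlab
% Hypothetical toy data; the real table has many more columns.
dataset = table({'Manual';'Auto';'Manual'}, {'Petrol';'Diesel';'Petrol'}, [30;25;41], ...
    'VariableNames', {'Transmission','FuelType1','CityMPG'});
dataset.Transmission = categorical(dataset.Transmission);
dataset.FuelType1    = categorical(dataset.FuelType1);

% Variant 1: append numeric copies as new table variables.
dataset.transm = grp2idx(dataset.Transmission);
dataset.fuel   = grp2idx(dataset.FuelType1);

% Variant 2: work on a copy and overwrite the categorical variables in place.
dataset_normalised = dataset(:, {'Transmission','FuelType1','CityMPG'});
dataset_normalised.Transmission = grp2idx(dataset_normalised.Transmission);
dataset_normalised.FuelType1    = grp2idx(dataset_normalised.FuelType1);
```

If the toolbox is not installed, double(dataset.Transmission) returns the category indices of a categorical array in base MATLAB.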
Reply 12
:eek::eek::eek:
You have no idea how much you just helped me!!!!!
It had taken me well over 10 hours to work out how to do this

Thanks so much for your help, you are an absolute legend!


Can I be cheeky and ask: if I need any more help, can I post on this thread?

Thanks again!!! :smile:
Reply 13
Sure, but post what you've tried first, as it's obviously your coursework. Posting code, a variable listing and example data is also very useful.
Reply 14
Absolutely, thank you
