Variable management in STATA

Data entered in STATA can be classified as either numeric or string type. Each numeric variable also has a storage type: byte, int, long, float, or double. STATA takes “float” as the default storage type for new variables. byte, int and long are usually used to hold integers, while float and double can also hold fractional values. The table below gives the minimum and maximum value that each storage type can hold, along with its size in bytes.

Storage type   Minimum                   Maximum                  Closest to 0 without being 0   Bytes
byte           -127                      100                      +/-1                           1
int            -32,767                   32,740                   +/-1                           2
long           -2,147,483,647            2,147,483,620            +/-1                           4
float          -1.70141173319 x 10^38    1.70141173319 x 10^38    +/-10^-38                      4
double         -8.9884656743 x 10^307    8.9884656743 x 10^307    +/-10^-323                     8
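The storage types in the table can be tried out with a minimal sketch such as the one below. The variable names (age, income, pop, ratio, score) and their values are made up for illustration, and the last line assumes the default type has not been changed with “set type”.

```stata
* Illustrative only: variable names and values are hypothetical
clear
set obs 3
* small integers fit in byte
generate byte age = 25
* 32000 is within the int range
generate int income = 32000
* too large for int, but fits in long
generate long pop = 2000000
* double keeps more digits than float
generate double ratio = 1/3
* no type given, so the default (float) is used
generate score = 1/3
* the storage type of each variable is listed in the second column
describe
```

Running “describe” afterwards is a quick way to check which storage type each variable actually received.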

Strings are stored as str#, i.e. str1, str2, str3, ..., str2045, or as strL. The number gives the maximum length of the string; for example, str5 holds strings of up to 5 characters.

For example, male would be str4 (since the word ‘male’ has 4 characters) and female would be str6 (since the word ‘female’ has 6 characters).
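The male/female example can be reproduced with a short sketch. The variable name gender and the two-observation dataset are assumptions made only for this illustration.

```stata
* Illustrative only: a hypothetical two-row dataset
clear
set obs 2
* with no type given, the shortest str# that fits is chosen: str4 here
generate gender = "male"
describe gender
* storing a longer value promotes the variable automatically: str4 becomes str6
replace gender = "female" in 2
describe gender
```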

Note that since STATA holds the data in memory, storage space should be used judiciously.

For example, the string female, with a length of 6, would waste memory if stored as str20. Similarly, a numeric value that fits in “byte” would waste storage if it is saved as “double”.
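As a rough sketch of this point, the “compress” command can be used to shrink variables that were stored in a larger type than they need. The variables id and gender below are hypothetical and deliberately oversized.

```stata
* Illustrative only: deliberately wasteful storage types
clear
set obs 5
* small whole numbers stored as double (8 bytes each)
generate double id = _n
* a 6-character word stored as str20
generate str20 gender = "female"
describe
* compress picks the smallest type that holds the data without loss
compress
describe
```

After “compress”, id should be reported as byte and gender as str6, with the values themselves unchanged.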

Converting string to numeric

Once the data is entered into STATA, it automatically assigns a storage type to each variable. However, sometimes a variable that holds numerical values is read as a string variable. Since string variables cannot be used in numerical analysis, such a variable should be converted to numeric.

Check the data set in the “Variables” window (see below), where the Name, Label, Type and Format of each variable are listed.

For example, in this case the variable “make” is registered as str18; in order to use it in computations, it has to be converted into a numeric variable.

Figure: Variable list in STATA

To convert, use the command “destring”. This command converts string variables into numeric variables; the reverse conversion, from numeric to string, is done with “tostring”. Strings can be converted into numeric in two ways.

  1. Replace the string variable.
  2. Create a new numeric variable.

In order to replace “make”, we will use the command:

destring make, replace

To generate a new variable, use the command:

destring make, gen(make1)

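where make1 is the new numeric variable. Note that destring only converts strings whose contents are numbers. If “make” holds text such as names, destring reports that it contains nonnumeric characters and leaves it unchanged; in that case “encode make, gen(make1)” can be used instead, which assigns a numeric code to each distinct text value.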
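The difference between the two situations can be sketched on a small made-up dataset. The three rows, the string variable price and the new names price_num and make_code are assumptions for illustration, not part of the original example.

```stata
* Illustrative only: a hypothetical three-row dataset typed in by hand
clear
input str5 price str12 make
"4099" "AMC Concord"
"4749" "AMC Pacer"
"3799" "AMC Spirit"
end
* price holds digits stored as text, so destring can convert it
destring price, generate(price_num)
* make holds text, so encode is used: each distinct value gets a numeric code
* and the original text is kept as a value label
encode make, generate(make_code)
describe
list
```

In “list”, make_code still displays the car names because of the value label, but it is stored as a number and can be used where a numeric variable is required.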

Other commands which can be used are:

  1. compress to reduce the memory used by the data.
  2. destring to convert string to numeric, and tostring for the reverse.
  3. format to set the output (display) format.
  4. recast to change the storage type.
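A brief sketch of format, recast and tostring is given below, using hypothetical variables (price, units, units_str) created only for this purpose.

```stata
* Illustrative only: hypothetical variables
clear
set obs 3
generate double price = 1234.5678 * _n
* format changes only how values are displayed, not what is stored
format price %9.2f
list price
* recast changes the storage type; these values fit in int, so nothing is lost
generate long units = 100 * _n
recast int units
* tostring is the reverse of destring: numeric to string
tostring units, generate(units_str)
describe
```

If the values did not fit in the requested type, recast would leave the variable unchanged unless the force option were given, so it is safer to check the result with “describe”.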

Working with Variable Manager

On the main Stata window, click on “variable manager” to manage variables.

Figure: Variable manager tab in STATA main window

A new window will open:

Figure: Variable manager in STATA, used to manage the variables (label, format, type)

For each variable, the following properties are shown on the right-hand side and can be edited:

  • name,
  • label,
  • type,
  • format,
  • value label and
  • notes (if any).
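The same properties can also be set from the command line. The sketch below assumes the auto dataset that ships with STATA (loaded with “sysuse auto”); the new name mileage, the label text and the note are made up for illustration.

```stata
* Assumes the example dataset auto.dta shipped with STATA
sysuse auto, clear
* name
rename mpg mileage
* label
label variable mileage "Mileage (miles per gallon)"
* format: show price with commas and no decimals
format price %9.0fc
* notes
notes mileage: renamed from mpg for readability
describe mileage price
notes list mileage
```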

In case of categorical variables, define value labels by clicking on “Manage”.

Figure: Variable Manager window in STATA, where variable names and formats can be checked and edited

Click on “Add Value” to add codes to each sub-category of the variable.

For example, to add information about gender, click “Add Value”. A new tab will open in which to define the value 1 for Male and 2 for Female.
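The gender coding above can also be defined with commands. This is a minimal sketch on a hypothetical four-row dataset; the label name genderlbl is an arbitrary choice.

```stata
* Illustrative only: a hypothetical dataset with gender coded 1 and 2
clear
set obs 4
generate byte gender = 1
replace gender = 2 in 3/4
* define the codes once, then attach them to the variable
label define genderlbl 1 "Male" 2 "Female"
label values gender genderlbl
* the table shows Male and Female instead of 1 and 2
tabulate gender
```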

However, new variables cannot be added in this window. They can be added in the “Data Editor” (Edit) window. See the image below for the Data Editor window:

Figure: Data editor window in STATA, where data can be added using the Data Editor (Edit) option

As shown in the figure above, the properties of each variable can be managed (modified individually) by clicking on the “Variable Properties” icon.

Shruti Datt

Project Handler at Project Guru