How to manage variables in STATA?

Data entered in STATA are classified as either numeric or string. Each kind of data has its own storage types: numbers are stored as byte, int, long, float, or double. STATA uses “float” as the default storage type for new numeric variables. The types byte, int, and long hold integers, while float and double can also hold fractional values. The table below gives the minimum and maximum value for each storage type, along with its size in bytes.

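| Storage type | Minimum | Maximum | Closest to 0 without being 0 | Bytes |
|---|---|---|---|---|
| byte | -127 | 100 | +/-1 | 1 |
| int | -32,767 | 32,740 | +/-1 | 2 |
| long | -2,147,483,647 | 2,147,483,620 | +/-1 | 4 |
| float | -1.70141173319 × 10^38 | 1.70141173319 × 10^38 | +/-10^-38 | 4 |
| double | -8.9884656743 × 10^307 | 8.9884656743 × 10^307 | +/-10^-323 | 8 |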
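As a minimal sketch of how these storage types appear in practice, the snippet below creates one variable of each numeric type and lists them with describe. The variable names (count_small, ratio_flt and so on) are made up for illustration and are not part of any real data set.

```stata
* Start with an empty data set of three observations
clear
set obs 3

* Request a storage type explicitly by naming it before the variable
generate byte   count_small = _n
generate int    count_mid   = _n * 1000
generate long   count_big   = _n * 1000000
generate double ratio_dbl   = _n / 3

* With no type given, the new variable gets the default type, float
generate        ratio_flt   = _n / 3

* The "storage type" column of the output shows the type of each variable
describe
```

The last variable shows the default at work: no type was requested, so it is stored as float.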

Strings are stored as str#, i.e. str1, str2, str3, ..., str2045, or as strL for longer text. The number after “str” is the maximum length of the string; for example, str5 holds strings of up to 5 characters.

For example, ‘male’ would be stored as str4 (since the word has 4 characters) and ‘female’ would be stored as str6 (since the word has 6 characters).

Note that since STATA holds the data set in memory, storage space should be used judiciously.

For example, the string ‘female’, with a length of 6, would waste memory if stored as str20. Similarly, a numeric value small enough to fit in a “byte” would waste storage if it is saved as “double”.
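The waste described here can be seen, and undone, with the compress command. The sketch below uses a tiny made-up data set (the variables gender and code are hypothetical) that is deliberately stored in types that are too large.

```stata
* Enter two observations using oversized storage types on purpose
clear
input str20 gender double code
"male"   1
"female" 2
end

* Before: gender is str20 and code is double
describe

* compress shrinks each variable to the smallest type that loses no information
compress

* After: gender is str6 (the longest value, "female") and code is byte
describe
```

compress never changes the values themselves, only how they are stored, so it is generally safe to run before saving a data set.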

Converting a string to numeric

Once the data are entered into STATA, it automatically assigns a storage type to each variable. However, sometimes a variable that holds numerical values is read in as a string variable. Since a string variable cannot be used in numerical calculations, it should be converted to numeric.

Check the data set in the “Variables” window (see below), which shows the Name, Label, Type and Format of each variable.

For example, in this case the variable “make” is registered as str18. To use it in numerical computations, it has to be converted to numeric.

[Figure: Variable list in STATA]

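To convert, use the command “destring”. This command converts string variables into numeric variables; the reverse conversion, from numeric to string, is done with “tostring”. Note that destring only converts strings whose contents are numbers; a variable that contains other characters is left unchanged. A string can be converted to numeric in two ways.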

  1. Replace the string variable.
  2. Create a new variable that is numeric.

In order to replace “make”, we will use the command:

destring make, replace

To generate a new variable, use the command:

destring make, gen(make1)

where make1 is the new variable in the numeric form.
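Both options can be tried on a small made-up data set, sketched below. The variables price_str and city and their values are hypothetical, chosen only to show one string that holds numbers and one that holds text. The final step uses encode, a separate command that assigns integer codes to text categories; it is shown as one possible alternative, not as part of destring.

```stata
* A string variable holding numbers, and one holding text
clear
input str4 price_str str6 city
"4099" "Delhi"
"4749" "Mumbai"
"3799" "Pune"
end

* Option 2: keep the string variable and add a numeric copy
destring price_str, generate(price_num)

* Option 1: overwrite the string variable with its numeric version
destring price_str, replace

* Text that is not a number is not converted; destring only prints a note
destring city, generate(city_num)

* encode gives each distinct text value an integer code with a value label
encode city, generate(city_code)

describe
```

After this runs, price_str and price_num are both numeric, city is still a string, and city_num does not exist because destring found non-numeric characters.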

Other commands which can be used are:

  1. compress to reduce the memory used by the data.
  2. destring to convert string to numeric (tostring does the reverse).
  3. format to set output format.
  4. recast to change storage type.
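A brief sketch of recast, format and the reverse conversion with tostring follows. It assumes the auto example data set that ships with STATA (loaded with sysuse); the new variable name rep78_str is made up for illustration.

```stata
* Load the example data set installed with STATA
sysuse auto, clear

* recast changes the storage type; here price is promoted to double
recast double price

* format changes only how values are displayed, not how they are stored
format price %10.2f
describe price

* Narrowing back to int is allowed because no value would change
recast int price

* tostring makes a string copy of a numeric variable
tostring rep78, generate(rep78_str)
describe price rep78 rep78_str
```

recast refuses to narrow a variable when that would alter its values unless the force option is given, which is why the step back to int is safe here.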

Working with Variable Manager

On the main Stata window, click on “variable manager” to manage variables.

[Figure: Variable manager tab in STATA main window]

A new window will open:

[Figure: Variable Manager in STATA, used to manage variables (label, format, type)]

For each variable, the following properties can be viewed and edited on the right-hand side:

  • name,
  • label,
  • type,
  • format,
  • value label and
  • notes (if any).
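The same properties can also be set from the command line. The sketch below again assumes the auto example data set; the new name mileage, the label text and the note text are invented for illustration.

```stata
sysuse auto, clear

* name: rename an existing variable
rename mpg mileage

* label: attach a descriptive variable label
label variable mileage "Mileage (miles per gallon)"

* type: change the storage type
recast double price

* format: change the display format
format price %12.2f

* notes: attach a free-text note to a variable
notes price: Hypothetical note added for illustration

* Review the result
describe price mileage
notes
```

Value labels are the one property in the list not shown here, since they need a label to be defined first.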

In the case of categorical variables, define value labels by clicking on “Manage”.

[Figure: Variable Manager window in STATA]

Click on “Add Value” to add codes to each sub-category of the variable.

For example, to add information about gender, click “Add Value”. A new tab will open, in which the value 1 can be defined for Male and 2 for Female.
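The gender coding described above can also be written as commands. This is a minimal sketch on a made-up variable; the label name gender_lbl is an arbitrary choice.

```stata
* A small numeric variable coded 1 and 2
clear
input byte gender
1
2
2
1
end

* Define the mapping from codes to text
label define gender_lbl 1 "Male" 2 "Female"

* Attach the mapping to the variable
label values gender gender_lbl

* Output now shows Male and Female instead of 1 and 2
tabulate gender
```

The variable itself still stores 1 and 2, so it stays numeric for analysis; only its display changes.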

However, new variables cannot be added in this window. They can only be added in the “Data Editor” (Edit) window. See the image below for the Data Editor window:

[Figure: Data editor window in STATA]

As shown in the figure above, each variable can be managed (modified individually) by clicking on the “Variable Properties” icon.
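Outside the point-and-click windows, a new variable can also be created with the generate command. The sketch below assumes the auto example data set; the variable names price_per_lb and expensive, and the 6000 cut-off, are invented for illustration.

```stata
sysuse auto, clear

* A new variable computed from two existing ones, stored as double
generate double price_per_lb = price / weight
label variable price_per_lb "Price per pound of weight"

* A 0/1 indicator stored as byte to save memory
generate byte expensive = price > 6000 if !missing(price)

describe price_per_lb expensive
```

Choosing byte for the indicator follows the earlier advice on using storage space judiciously.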
