These options control how columns are converted between data types (factors, variates and texts) and how data is added to columns.
- To access this dialog select Tools | Spreadsheet Options then click the Conversions tab.
- Set options as required then click OK to close the dialog.
Sort factor levels on convert | When converting text or variate columns into factors, this option will sort the numeric levels (or text labels) into ascending order. |
Suggest filling in missing factor levels |
Selecting this option opens a dialog when a factor column has a large number of missing values and is being read from a foreign data source such as Excel. The dialog lets you carry out a fill down action to copy values into cells with missing values. |
Allow new factor levels in edit | This lets you add new factor levels/labels to a spreadsheet by typing them directly into a cell and overwriting the original contents. Entering a level or label that doesn’t exist in the factor will open up a dialog querying if you want to add a new level/label to the factor, or change the entered value to the closest matching level/label. |
Suggest converting columns with <= unique items to factors |
Selecting this option opens a dialog when a column has repeated values, but has fewer unique values than the number specified here. The dialog lets you convert the columns to factors. |
Tolerance on assigning values to nearest level |
When you enter a numerical value into cell in a factor column, if the number is not an existing factor level, it will be assigned to the closest existing level, provided the difference between the entered number and that level is less than the provided tolerance (which must be positive). A value of 0 will always raise an error on any mismatch. For example, you have existing factor levels 10, 20 and 30 and a tolerance of 1. Entering 9 or 11 is within tolerance of level 10, so Genstat will convert your value to this level. Any other values close to 10 will result in the error message below. |
Tolerance on creating levels from variate |
A positive value in this field will be used as a tolerance in creating levels from a factor. If you enter two values that differ by less than this tolerance, then they will be assigned to the same factor group. The default value of 0 places all unique values into their own group. If the variate being converted to a factor contains calculated values, then these values may differ in their final few significant figures. For example, the calculated values 1/3 and 200/600 may be assigned to different groups depending on how the final figure rounds. A small value such as 1e-12 could then be used to allocate these two distinct values to the same group. |
Remove unused factor levels on update |
If selected, when updating Genstat, any factor levels or labels that are not used will be removed. For example, if a factor column Weekdays does not contain an entry for Tuesday, the label Tuesday will be removed from the set of labels when you next update Genstat. This option is only applied once, either when a column is new or is imported from a foreign source. |
Use standard factor orders | When selected, if all labels in a factor match either the compass points, weekday or month names, or the User defined label order (below), then these labels will be sorted into the order of the matching list |
User defined label order |
This opens the User Label Sort Order dialog, which allows a list of labels to be entered. If the labels in a factor match the user defined labels, then they will be sorted into the order in this list. Multiple orders may be entered by providing these in sections. For example a colour order and a species order can be entered by providing the list of labels: Black, Blue, Green, Red, Yellow, White, followed by Clover, Lotus, Ryegrass, Cocksfoot. |
Ignore case | Specifies that individual lower and upper case characters in text columns will be ignored during certain operations, such as sort, search, and convert to factors. In other words, case will not affect these operation. |
Ignore blanks | Leading and trailing blank characters in text columns will be ignored during certain operations, such as sorting and converting to factors. |
Check for date values | When selected, Genstat checks all new text columns to see if they contain data in date format. If columns appear to contain dates the Convert Text to Date dialog will prompt you to convert these to dates. Check for date values can also be enabled/disabled when you paste columns from the clipboard, merge multiple files or append multiple files. |
Text to number conversions |
Sometimes numbers are accidentally mistyped as letters. Common mistypings include i, I, l, L for the number 1; o, O for the number 0; a comma for a decimal point, etc. Text to number enables Genstat to convert text labels by substituting numbers for letters or, where this is not possible, the label is converted to a missing value.
|
Maximum text length | The maximum text length in Genstat is a user-defined limit between 90 and 10,000. You can select a limit from the dropdown list or type your own value. Texts that are read from any data source which are longer than the specified limit will be truncated. |
Unicode in Excel XLSX files |
When reading an Excel xlsx file, this option controls what happens to Unicode characters that are not supported by the extended ASCII character set. Some Unicode characters, such as Greek or mathematical symbols, can be displayed using the Genstat typesetting commands used by the PRINT directive. However, many foreign language Unicode characters cannot be displayed by Genstat and these will be removed with a warning. The options below control how the supported Unicode characters are handled.
The conversion of Greek letters to ASCII characters is given in the table below. |
Auto extend on pasting cells | When pasting data from the clipboard new rows or columns will be added to the sheet as required. If this option is not enabled clipboard data will be truncated at the current row and column extents. |
Auto delete on cut | When data are cut from a spreadsheet they are replaced by missing values by default. If this option is enabled, Genstat will delete entire rows or columns that are cut, rather than filling the cells with missing values. |
Calculations |
When a variate created from a calculation is converted to a factor, re-calculating the column may generate some invalid factor levels. For example, if the column C1 contains the values 1, 2, 3 and 4 and the column C2 is calculated as C2 = 2*C1, this will produce the values 2, 4, 6 and 8. After converting C2 to a factor, if a value is changed in C1 to 5 this will create a recalculated value of 10 which is not a valid factor level (2, 4, 6 or 8) of C2. This item controls what happens when converting the variate to a factor:
|
Defaults | Resets the spreadsheet options on this tab to their default values. |