cleanser



Show raw data
                    

            

Missing data

The barplot on the left-hand side shows the number of missing values in each variable.
The aggregation plot on the right-hand side shows every combination of missing and non-missing values that occurs in the observations of the dataset.
Non-missing values are colored in blue.
Missing values are colored in red.
Additionally, the horizontal bars on the right side of the graphic indicate how frequently each combination occurs in the dataset.
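
The numbers behind the two plots can be reproduced by hand. The following is a minimal base R sketch, not the app's own code; the data frame df and its columns are made up for illustration.

```r
# Toy data frame with a few missing values (hypothetical example data).
df <- data.frame(
  weight = c(4.2, NA, 6.1, 5.0),
  size   = c(10.1, 9.8, NA, 11.3),
  group  = c("a", NA, "b", "b")
)

# Number of missing values per variable (the quantity shown by the barplot).
na_counts <- colSums(is.na(df))

# Encode each row's missingness pattern as a string such as "0-1-0",
# where 1 marks a missing value, then count how often each pattern occurs
# (the quantity shown by the aggregation plot and its horizontal bars).
na_pattern <- apply(is.na(df), 1, function(row) paste(as.integer(row), collapse = "-"))
pattern_counts <- sort(table(na_pattern), decreasing = TRUE)

print(na_counts)
print(pattern_counts)
```

Each pattern string lists the variables in column order, so "1-0-1" means that weight and group are missing while size is present.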

Remove columns

In the following table, you can choose which columns to keep in the final output.
Three buttons are available:
reset: restores the original data set.
inverse selection: selects the columns that were not previously selected and deselects the others.
delete columns: deletes the selected variables.
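
What the three buttons do can be sketched in base R as follows. This is only an illustration under assumed names (original, current, selected); it is not taken from the app's source.

```r
# Hypothetical data; `original` is kept untouched so that reset can restore it.
original <- data.frame(id = 1:3,
                       weight = c(4.2, 5.5, 6.1),
                       size = c(10.1, 9.8, 11.3))
current  <- original
selected <- c("size")            # columns currently ticked in the table

# inverse selection: select everything that was not selected before.
selected <- setdiff(names(current), selected)
inverted <- selected             # kept for inspection: "id" "weight"

# delete columns: drop the selected variables, keep the rest.
current <- current[, setdiff(names(current), selected), drop = FALSE]
after_delete <- names(current)   # kept for inspection: "size"

# reset: come back to the original data set.
current <- original
```

Using drop = FALSE keeps the result a data frame even when only one column remains.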

Summary table for the quantitative variables

Summary table for the qualitative and date variables

Data validation

This tab allows you to quickly validate your data using simple rules. You can enter rules in the text area below. Write one rule per line.
Suppose that the name of the column that you want to check is weight and that you want to highlight all rows where weight is greater than 5. This can be done like so:

weight > 5

Rows where weight is greater than 5 will be highlighted in the table below.

If you want to test several rules, you need to write one rule per line:

weight > 5
size < 10.8

Export offending rows
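
For readers who want to reproduce this check outside the app, here is a minimal base R sketch. It assumes that a row is highlighted as soon as it matches at least one rule; the data, the variable names and the output file are hypothetical, and the app itself may evaluate rules differently.

```r
# Hypothetical data with the two columns used in the rules above.
df <- data.frame(weight = c(4.2, 5.5, 6.1),
                 size   = c(11.0, 12.0, 9.5))

# Rules as typed in the text area: one rule per line.
rules_text <- "weight > 5\nsize < 10.8"
rules <- trimws(strsplit(rules_text, "\n", fixed = TRUE)[[1]])
rules <- rules[nzchar(rules)]    # ignore empty lines

# Evaluate every rule against the columns of df: one logical column per rule.
# Note: eval(parse()) runs arbitrary code, so only use it on trusted input.
hits <- vapply(rules,
               function(rule) as.logical(eval(parse(text = rule), envir = df)),
               logical(nrow(df)))

# Rows matching at least one rule (assumption: these are the highlighted rows).
offending <- df[rowSums(hits, na.rm = TRUE) > 0, , drop = FALSE]

# Export the offending rows to a CSV file (temporary file used for illustration).
out_file <- tempfile(fileext = ".csv")
write.csv(offending, out_file, row.names = FALSE)
```

With this toy data the first row matches neither rule, so only the second and third rows are exported.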

Data correction

You can use the text box below to enter simple modification rules to apply to your data.
As an example, suppose you are working with data on wages and you have to top-code yearly earnings at 60000 euros. This can be done like so:

if (yr_earnings >= 60000) yr_earnings <- 60000

It is also possible to set the offending values to NA, and then impute these values using the next tab of the app.
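
The rule syntax above is interpreted by the app. In plain R, if expects a single TRUE or FALSE, so the same two corrections on a whole column would be written in vectorized form. The sketch below shows one way to do it; the data frame wages is made up for illustration.

```r
# Hypothetical wage data.
wages <- data.frame(yr_earnings = c(35000, 60000, 82000))

# Top-coding: every value at or above 60000 becomes exactly 60000.
topcoded <- wages
topcoded$yr_earnings <- pmin(topcoded$yr_earnings, 60000)

# Alternative: set the offending values to NA so they can be imputed later.
flagged <- wages
flagged$yr_earnings[flagged$yr_earnings >= 60000] <- NA
```

Note that the NA variant uses the same condition as the rule (>= 60000), so a value of exactly 60000 is also set to NA.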

Save to file