Data Import and Export:

Data Import:

  1. CSV (Comma Separated Values):
    • R: Use read.csv() function.
    • Python (Pandas): Use pandas.read_csv().
  2. Excel Files:
    • R: Use read_excel() from the readxl package, or read.xlsx() from openxlsx.
    • Python (Pandas): Use pandas.read_excel().
  3. Text Files (txt):
    • R: Use readLines() for plain text or read.table() for structured text files.
    • Python: Use open() for plain text, or pandas for structured data (e.g., pandas.read_csv() with a custom sep, or pandas.read_table()).
  4. JSON (JavaScript Object Notation):
    • R: Use fromJSON() from the jsonlite package.
    • Python: Use json.load() for files or json.loads() for strings (both from the json module), or pandas.read_json() for tabular data.
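
The Python import calls listed above can be sketched as follows. This is a minimal illustration, not a complete recipe: the sample data and variable names are made up, and the data is held in memory so the sketch needs no files on disk. Excel is left out because pandas.read_excel() needs an additional engine package such as openpyxl.

```python
import io
import json

import pandas as pd

# Hypothetical sample data, kept in memory instead of in files.
csv_text = "name,score\nAda,91\nGrace,88\n"
json_text = '{"name": "Ada", "scores": [91, 88]}'

# CSV: read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# Structured text: the same reader handles other delimiters through `sep`.
tsv_df = pd.read_csv(io.StringIO(csv_text.replace(",", "\t")), sep="\t")

# Plain text: splitting into lines, as one would after open(path).read().
lines = csv_text.splitlines()

# JSON: json.loads parses a string; json.load reads from an open file object.
record = json.loads(json_text)

print(df)
print(record["scores"])
```

With real files, the io.StringIO(...) arguments would be replaced by file paths.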

Data Export:

  1. CSV (Comma Separated Values):
    • R: Use write.csv(), or write.csv2() for locales that use a comma as the decimal mark (it separates fields with semicolons instead).
    • Python (Pandas): Use DataFrame.to_csv().
  2. Excel Files:
    • R: Use write_xlsx() from writexl, or write.xlsx() from openxlsx.
    • Python (Pandas): Use DataFrame.to_excel().
  3. Text Files (txt):
    • R: Use writeLines() for plain text or write.table() for structured text files.
    • Python: Use open() with write() for plain text, or DataFrame.to_csv() with a custom sep for structured data.
  4. JSON (JavaScript Object Notation):
    • R: Use toJSON() from the jsonlite package.
    • Python: Use json.dump() to write to a file or json.dumps() to produce a string (both from the json module), or DataFrame.to_json() for tabular data.
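
A rough sketch of the Python export calls above, again kept in memory. The DataFrame contents and variable names are invented for illustration. Excel is omitted because DataFrame.to_excel() requires an engine package such as openpyxl.

```python
import io
import json

import pandas as pd

# Hypothetical data to export.
df = pd.DataFrame({"name": ["Ada", "Grace"], "score": [91, 88]})

# CSV: with no path, to_csv returns the text; index=False drops the row labels.
csv_text = df.to_csv(index=False)

# Semicolon separator with a decimal comma, the convention R's write.csv2() follows.
tenths = df.assign(score=df["score"] / 10)
csv2_text = tenths.to_csv(index=False, sep=";", decimal=",")

# JSON from a DataFrame: one object per row.
json_text = df.to_json(orient="records")

# JSON from plain Python objects: json.dump writes to a file-like object.
summary = {"rows": len(df), "columns": list(df.columns)}
buffer = io.StringIO()
json.dump(summary, buffer)

print(csv_text)
print(json_text)
```

Passing a file path as the first argument to to_csv() or to_json(), or an opened file to json.dump(), writes to disk instead.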

Attributes and Data Types:

Attributes:

  1. Nominal Attribute:
    • Categories with no order or ranking (e.g., colors, types).
  2. Ordinal Attribute:
    • Categories with a specific order or ranking (e.g., low, medium, high).
  3. Interval Attribute:
    • Data with a consistent interval between values, but no true zero point (e.g., temperature in Celsius).
  4. Ratio Attribute:
    • Data with a consistent interval between values and a true zero point (e.g., height, weight).
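
The four attribute levels above differ in which operations are meaningful. The sketch below illustrates this with invented values, using pandas Categorical for the nominal and ordinal cases; it is one possible way to show the distinction, not a prescribed method.

```python
import pandas as pd

# Nominal: categories without order; only equality checks and counts make sense.
colors = pd.Series(pd.Categorical(["red", "blue", "red"]))
color_counts = colors.value_counts()

# Ordinal: declaring the order makes comparisons and min/max meaningful.
levels = pd.Series(
    pd.Categorical(
        ["low", "high", "medium"],
        categories=["low", "medium", "high"],
        ordered=True,
    )
)
above_low = (levels > "low").tolist()
highest = levels.max()

# Interval: differences are meaningful, but ratios are not, because 0 °C is arbitrary.
celsius = [10.0, 20.0]
difference = celsius[1] - celsius[0]
# On the Kelvin scale, 20 °C is only about 3.5% "hotter" than 10 °C, not twice as hot.
kelvin_ratio = (celsius[1] + 273.15) / (celsius[0] + 273.15)

# Ratio: a true zero makes "twice as heavy" a valid statement.
weights_kg = [40.0, 80.0]
weight_ratio = weights_kg[1] / weights_kg[0]

print(color_counts)
print(above_low, highest)
print(difference, round(kelvin_ratio, 3), weight_ratio)
```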

Data Types:

  1. Numeric Data Types:
    • Integer: Whole numbers (e.g., 1, 2, -3), typically used for discrete values such as counts.
    • Float (or Double): Numbers with decimals (e.g., 1.5, -0.003), typically used for continuous measurements.
  2. Text and Categorical Data Types:
    • Character/String: Text data (e.g., “hello”, “category A”).
    • Factor: Categorical data with a fixed set of levels (factor in R; pandas offers the analogous Categorical type).
  3. Boolean Data Type:
    • Represents true or false values (e.g., TRUE and FALSE in R, True and False in Python).
  4. Date and Time Data Types:
    • Date: Represents calendar dates (e.g., “2022-09-27”).
    • Time: Represents time of day (e.g., “14:30:00”).
    • DateTime (or Timestamp): Represents both date and time.
  5. Composite Data Types:
    • Structures that group other values, such as lists, dictionaries (Python), or data frames (R and pandas). These are distinct from complex numbers.
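
As a small illustration, the data types above map onto Python's built-in and standard-library types roughly as follows. The variable names and values are arbitrary, and the factor type is not shown because Python has no built-in equivalent.

```python
from datetime import date, datetime, time

count = 3                 # integer
price = 1.5               # float
label = "category A"      # string
is_valid = True           # boolean

# Date, time of day, and a timestamp combining both.
day = date.fromisoformat("2022-09-27")
clock = time.fromisoformat("14:30:00")
stamp = datetime.combine(day, clock)

# Composite types: a dictionary holding a string and a list.
record = {"label": label, "values": [count, price]}

# Print each value next to the name of its type.
for value in (count, price, label, is_valid, day, clock, stamp, record):
    print(type(value).__name__, value)
```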

Understanding these concepts helps in correctly handling and analyzing data, as different types of data may require different processing and visualization techniques. Additionally, it’s essential for data preprocessing and feature engineering when building machine learning models.