Upload File or Folder

Understanding Upload File or Folder

The OvalEdge "Upload File or Folder" tool transfers files and folders from a user's device to the application over a Network File System (NFS) connection.

After a successful upload, users can catalog and organize digital assets within OvalEdge. The uploaded files and folders appear in the File Manager, and administrators can use Cataloging to organize them in the Data Catalog.

Because uploaded files are stored on a remote server, the upload also gives users a copy that survives local device failure or data loss. It also supports collaboration by letting users share files over a network.

Navigating Upload File or Folder

Author Users in the OvalEdge application can access the "Upload File or Folder" tool through the Advanced Tools module. This tool is organized into three main tabs, each representing a step in the file or folder upload process:

  • Select Data Lake: The first tab, where users choose the destination Data Lake for their upload.
  • Select Your Directory: The second tab, where users choose or create the directory on the selected connection that will receive the upload.
  • Upload File or Folder: The final tab, where users select the file or folder on their local device and start the upload.

These tabs guide users through the sequential steps of the upload process, making it straightforward and user-friendly.

Select Data Lake

The Data Lake serves as a centralized storage hub for structured and unstructured data in OvalEdge. On the "Select Data Lake" page, users can explore available file connections and search for different connection types. This allows users to easily locate and choose the appropriate Data Lake for their specific needs.

To start the upload, users select a connection name, such as "NFS - A," from the available list to indicate where the data will be stored. The connector type for this tool is NFS (Network File System). After making this selection, users move to the next tab, "Select Your Directory," where they choose the directory that will receive the upload.

Select Your Directory

Inside the Data Lake, several directories are set up specifically for uploaded data files, which keeps uploads organized. Users can also upload files directly into the Data Lake without choosing a subdirectory.

In the "Select Your Directory" tab of the Upload File or Folder tool, users can select an existing directory linked to the selected NFS connector as the destination for their files or folders.

Alternatively, users can create a new directory with the "Create Directory" option in the nine-dots menu. After giving the new directory a distinct name and clicking the "Next" button, users proceed to the third and final tab, "Upload File or Folder."

Upload File or Folder

In the 'Upload File or Folder' tab, users are given the option to choose between uploading a single file or an entire folder. This flexibility caters to the specific requirements of the data transfer, allowing the user to select the most suitable method for their needs.

The maximum file size allowed for upload is 2 MB, which corresponds to the ovaledge.filesize.limit system setting (2097152 bytes). The limit keeps uploads responsive and protects OvalEdge performance. Users should also check that the file type is one of the configured types (such as csv, conf, env, and others) so that the file is accepted by the system.
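As a rough illustration of these two constraints, the following Python sketch checks a file name and size before an upload is attempted. It is not part of OvalEdge: the function name `precheck_file`, the constant names, and the extension set (a partial list drawn from the formats named in this article) are assumptions made for the example.

```python
# Assumed values: the 2 MB limit comes from this article; the extension
# set is a partial, illustrative list, not the product's configuration.
MAX_UPLOAD_BYTES = 2 * 1024 * 1024  # 2,097,152 bytes
ALLOWED_EXTENSIONS = (".csv", ".conf", ".env", ".json", ".txt")


def precheck_file(name, size_bytes):
    """Return a list of reasons the file would likely be rejected.

    An empty list means the file passes both checks.
    """
    problems = []
    # endswith() also handles dotfiles such as ".env", which have no
    # "suffix" in the pathlib sense.
    if not name.lower().endswith(ALLOWED_EXTENSIONS):
        problems.append(f"{name}: file type is not in the allowed list")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append(
            f"{name}: {size_bytes} bytes exceeds the {MAX_UPLOAD_BYTES}-byte limit"
        )
    return problems


if __name__ == "__main__":
    for name, size in [("orders.csv", 1024), ("backup.csv", 3 * 1024 * 1024)]:
        print(name, precheck_file(name, size) or "ok")
```

Running a check like this locally only saves a failed attempt; the application's own validation remains the authority on what is accepted.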

Users can upload files or folders from their local device by using the "Select from your computer" button and choosing the desired files or folders. This action prompts users to locate and select the specific file they intend to upload.
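Before uploading a whole folder, it can help to see which files it contains and how large each one is. The sketch below is a local helper only, not an OvalEdge API; the name `list_folder_files` is hypothetical.

```python
from pathlib import Path


def list_folder_files(root):
    """Return (relative path, size in bytes) for every regular file under root."""
    base = Path(root)
    return sorted(
        (p.relative_to(base).as_posix(), p.stat().st_size)
        for p in base.rglob("*")  # rglob walks subdirectories recursively
        if p.is_file()
    )


if __name__ == "__main__":
    # Lists the contents of the current working directory as an example.
    for rel_path, size in list_folder_files("."):
        print(f"{size:>10}  {rel_path}")
```

The resulting list can be compared against the size limit and the configured file types before the folder is selected in the upload tab.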

After the upload process is complete, users can view their files or folders in the File Manager module of the OvalEdge application. The List View offers a detailed summary of all uploaded files and folders, organized across various directories.

To learn more about File Manager, see the File Manager Deep Dive Article.

OvalEdge-Supported Data Formats

The file formats supported by the OvalEdge Upload File or Folder tool are listed below:

| File Extension | Format Supported | Description of Format |
| --- | --- | --- |
| .csv | Comma-Separated Values | CSV (Comma-Separated Values) is a file format that stores tabular data in plain text. Each line represents a data record, and each record comprises one or more fields separated by commas. The file extensions commonly used for CSV files are .csv and .txt. |
| .conf | Plain Text | A CONF (Configuration) file is stored in plain text format. CONF files are commonly used as configuration files in Unix and Linux systems. |
| .ddl | Plain Text | A DDL (Data Definition Language) file contains SQL statements that describe database schemas. It is saved in plain text format and contains commands such as CREATE, ALTER, and DROP for defining and managing database structures. |
| .env | Key-Value Pair | The .env file extension is commonly associated with environment configuration files. These files often store configuration settings and sensitive information for applications. The content typically consists of key-value pairs, where each pair represents a configuration variable and its value. |
| .gz | Gzip | The .gz file extension is commonly associated with files compressed using the gzip compression algorithm. Compression reduces the original file's size, which saves storage and speeds up transmission over a network. |
| .hql | Hive Query Language | Hive Query Language (HQL) scripts are typically saved with the .hql file extension. These scripts contain queries and commands used to interact with data stored in Hadoop through the Hive platform. Users can define tables, run queries, and perform various data manipulation tasks using HQL. |
| .parquet | Apache Parquet | Parquet, part of the Apache Hadoop ecosystem, is a free and open-source column-oriented data storage format. Like other columnar storage formats in Hadoop, such as RCFile and ORC, Parquet is designed for efficient storage and processing of large datasets. |
| .json | JavaScript Object Notation | A JSON (JavaScript Object Notation) file stores simple data structures and objects in the JSON format, a widely adopted standard for data interchange. It is primarily used for transmitting data between web applications and servers and is easy to represent and parse. |
| Pipe Delimited File | Pipe Delimited (\|) | A pipe-delimited file is a type of delimited text file in which each line represents a single entity (e.g., a book or company). Fields within each line are separated by a delimiter, here a pipe (\|). This format allows field values of varying lengths, in contrast to flat files that use fixed-width fields. |
| .properties | Plain Text | A .properties file is a plain text file, commonly used by Java applications, that stores configuration information as key-value pairs. It is human-readable, so users can review and customize the settings directly. |
| .sql | Text | The .sql file extension indicates a Structured Query Language (SQL) file. This plain text file stores SQL statements used for creating or modifying database structures, as well as performing operations such as insertions, updates, and deletions. |
| .sh | Text | A file with the .sh extension is a script for the Unix shell. It contains instructions written in a shell language such as Bash and can be executed from the shell environment. |
| .xls | Microsoft Office Excel | An XLS file, associated with Microsoft Office Excel, contains rows and columns of cells. Each cell can hold text, numbers, or formulas. XLS spreadsheets may also include tables and charts that visualize selected data. |
| .xlsx | Microsoft Excel Open XML | A file with the .xlsx extension is a Microsoft Excel Open XML Spreadsheet (XLSX) file created by Microsoft Excel. |
| .tsv | Tab Separated Values | A tab-separated values file is a simple text format for storing data in a tabular structure, e.g., database table or spreadsheet data, and a way of exchanging information between databases. Each record in the table is one line of the text file, with fields separated by tabs. |
| .txt | Text | A TXT file is a standard text document that contains plain text. |
| .yaml | Text | YAML is a human-readable data serialization format. It is used for reading and writing data independent of a specific programming language. |
| .orc | Apache ORC | Optimized Row Columnar (ORC) files are commonly used in the Hadoop ecosystem, particularly with Apache Hive, as a storage format for structured data. They are designed to optimize query performance and reduce storage requirements in distributed data processing environments. |
| .avro | Apache Avro | Avro is a data serialization framework for efficient exchange of data between systems. It is designed to be fast and compact. Avro is schema-based, which makes data self-describing, and it supports dynamic typing and schema evolution, making it suitable where data structures may change over time. |
| .class | Java Bytecode | The .class file extension represents compiled Java classes, containing bytecode that is executed by the Java Virtual Machine. |
| .zip | ZIP | ZIP is a widely supported archive file format used for compressing and packaging files and directories, which makes it a standard choice for file compression and distribution. |
| .html | HyperText Markup Language | HTML (HyperText Markup Language) is the standard markup language for web pages. It structures the content of a page, defining elements such as text, images, links, and forms. |
| .jar | Java Archive | A JAR (Java Archive) file is a standard file format used to aggregate and distribute Java classes, metadata, and resources. It is a compressed format that packages multiple Java files into a single archive, simplifying the distribution, deployment, and execution of Java applications and libraries. |
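To make the difference between the delimited text formats listed above concrete (.csv, .tsv, and pipe-delimited), the sketch below parses the same two records with Python's standard csv module, changing only the delimiter. The sample data and the helper name `read_delimited` are illustrative assumptions and are unrelated to how OvalEdge itself reads these files.

```python
import csv
import io


def read_delimited(text, delimiter):
    """Parse delimited text into a list of rows (each row is a list of strings)."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


# The same header and record in three delimited formats.
csv_text = "id,title\n1,Dune\n"
tsv_text = "id\ttitle\n1\tDune\n"
pipe_text = "id|title\n1|Dune\n"

csv_rows = read_delimited(csv_text, ",")
tsv_rows = read_delimited(tsv_text, "\t")
pipe_rows = read_delimited(pipe_text, "|")

# All three parse to the same rows; only the separator differs on disk.
print(csv_rows == tsv_rows == pipe_rows)
```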

System Settings

The System Settings for Upload File or Folder let administrators configure how the tool behaves, including how many files can be uploaded and how large each file can be. These settings can be configured from the Administration > System Settings > Others tab.

| Key | Value | Description |
| --- | --- | --- |
| ovaledge.fileupload.maxfiles | 10 | The maximum number of files that can be uploaded to the OvalEdge application. The default value is 10, but any value can be entered in the provided field based on specific requirements. |
| ovaledge.filesize.limit | 2097152 | The maximum size of a file (in bytes) that can be uploaded to the OvalEdge application. The listed value, 2097152 bytes, equals 2 MB (2 × 1024 × 1024). |
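The sketch below shows how the two settings might be applied to a set of files, and confirms the bytes-to-megabytes arithmetic. It is an illustration under stated assumptions rather than OvalEdge's implementation: the `SETTINGS` dictionary simply mirrors the keys and values listed above, `check_batch` is a hypothetical helper, and the file-count limit is assumed to apply per upload operation.

```python
# Keys and values copied from the System Settings listed above.
SETTINGS = {
    "ovaledge.fileupload.maxfiles": 10,
    "ovaledge.filesize.limit": 2097152,  # bytes
}


def check_batch(file_sizes, settings=SETTINGS):
    """Check a {file name: size in bytes} mapping against both limits.

    Returns a list of problem descriptions; an empty list means the batch
    is within the configured limits.
    """
    problems = []
    max_files = settings["ovaledge.fileupload.maxfiles"]
    max_bytes = settings["ovaledge.filesize.limit"]
    if len(file_sizes) > max_files:
        problems.append(f"{len(file_sizes)} files selected, limit is {max_files}")
    for name, size in file_sizes.items():
        if size > max_bytes:
            problems.append(f"{name}: {size} bytes exceeds {max_bytes}")
    return problems


# 2097152 bytes divided by 1024 * 1024 bytes per MB
print(SETTINGS["ovaledge.filesize.limit"] / (1024 * 1024))  # 2.0
```

To raise the size limit to, for example, 5 MB, the value entered for ovaledge.filesize.limit would be 5 × 1024 × 1024 = 5242880.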