How to read a YAML file

heffner
Posts: 1
Joined: Wed May 06, 2015 3:14 pm
Location: Hamburg, Germany

Postby heffner » Thu Mar 03, 2016 2:17 pm

I would like to read a YAML configuration file for processing. A commented version might look like this:

absolute_urls: false                        # Absolute or relative URLs for `base_url`
timezone: ''                                # Valid values: http://php.net/manual/en/timezones.php
default_locale:                             # Default locale (defaults to system)
param_sep: ':'                              # Parameter separator, use ';' for Apache on windows
wrapped_site: false                         # For themes/plugins to know if Grav is wrapped by another platform
reverse_proxy_setup: false                  # Running in a reverse proxy scenario with different webserver ports than proxy
proxy_url:                                  # Configure a manual proxy URL for GPM (eg 127.0.0.1:3128)

languages:
  supported: []                             # List of languages supported. eg: [en, fr, de]
  include_default_lang: true                # Include the default lang prefix in all URLs
  translations: true                        # Enable translations by default
  translations_fallback: true               # Fallback through supported translations if active lang doesn't exist
  session_store_active: false               # Store active language in session
  http_accept_language: false               # Attempt to set the language based on http_accept_language header in the browser
  override_locale: false                    # Override the default or system locale with language specific one

home:
  alias: '/home'                            # Default path for home, ie /
  hide_in_urls: false                       # Hide the home route in URLs

pages:
  theme: antimatter                         # Default theme (defaults to "antimatter" theme)
  order:
    by: default                             # Order pages by "default", "alpha" or "date"
    dir: asc                                # Default ordering direction, "asc" or "desc"
  list:
    count: 20                               # Default item count per page
  dateformat:
    default:                                # The default date format Grav expects in the `date: ` field
    short: 'jS M Y'                         # Short date format
    long: 'F jS \a\t g:ia'                  # Long date format
  publish_dates: true                       # automatically publish/unpublish based on dates
  process:
    markdown: true                          # Process Markdown
    twig: false                             # Process Twig
  twig_first: false                         # Process Twig before markdown when processing both on a page
  events:
    page: true                              # Enable page level events
    twig: true                              # Enable twig level events
  markdown:
    extra: false                            # Enable support for Markdown Extra support (GFM by default)
    auto_line_breaks: false                 # Enable automatic line breaks
    auto_url_links: false                   # Enable automatic HTML links
    escape_markup: false                    # Escape markup tags into entities
    special_chars:                          # List of special characters to automatically convert to entities
      '>': 'gt'
      '<': 'lt'
  types: [txt,xml,html,htm,json,rss,atom]   # list of valid page types
  append_url_extension: ''                  # Append page's extension in Page urls (e.g. '.html' results in /path/page.html)
  expires: 604800                           # Page expires time in seconds (604800 seconds = 7 days)
  last_modified: false                      # Set the last modified date header based on file modification timestamp
  etag: false                               # Set the etag header tag
  vary_accept_encoding: false               # Add `Vary: Accept-Encoding` header
  redirect_default_route: false             # Automatically redirect to a page's default route
  redirect_default_code: 301                # Default code to use for redirects
  redirect_trailing_slash: true             # Handle automatically or 301 redirect a trailing / URL
  ignore_files: [.DS_Store]                 # Files to ignore in Pages
  ignore_folders: [.git, .idea]             # Folders to ignore in Pages
  ignore_hidden: true                       # Ignore all Hidden files and folders
  url_taxonomy_filters: true                # Enable auto-magic URL-based taxonomy filters for page collections

cache:
  enabled: true                             # Set to true to enable caching
  check:
    method: file                            # Method to check for updates in pages: file|folder|none
  driver: auto                              # One of: auto|file|apc|xcache|memcache|wincache
  prefix: 'g'                               # Cache prefix string (prevents cache conflicts)
  lifetime: 604800                          # Lifetime of cached data in seconds (0 = infinite)
  gzip: false                               # GZip compress the page output

twig:
  cache: true                               # Set to true to enable twig caching
  debug: false                              # Enable Twig debug
  auto_reload: true                         # Refresh cache on changes
  autoescape: false                         # Autoescape Twig vars
  undefined_functions: true                 # Allow undefined functions
  undefined_filters: true                   # Allow undefined filters
  umask_fix: false                          # By default Twig creates cached files as 755, fix switches this to 775

assets:                                     # Configuration for Assets Manager (JS, CSS)
  css_pipeline: false                       # The CSS pipeline is the unification of multiple CSS resources into one file
  css_minify: true                          # Minify the CSS during pipelining
  css_minify_windows: false                 # Minify Override for Windows platforms. False by default due to ThreadStackSize
  css_rewrite: true                         # Rewrite any CSS relative URLs during pipelining
  js_pipeline: false                        # The JS pipeline is the unification of multiple JS resources into one file
  js_minify: true                           # Minify the JS during pipelining
  enable_asset_timestamp: false             # Enable asset timestamps
  collections:
    jquery: system://assets/jquery/jquery-2.x.min.js

errors:
  display: false                            # Display full backtrace-style error page
  log: true                                 # Log errors to /logs folder

debugger:
  enabled: false                            # Enable Grav debugger and following settings
  shutdown:
    close_connection: true                  # Close the connection before calling onShutdown(). false for debugging

images:
  default_image_quality: 85                 # Default image quality to use when resampling images (85%)
  cache_all: false                          # Cache all image by default
  cache_perms: '0755'                       # MUST BE IN QUOTES!! Default cache folder perms. Usually '0755' or '0775'
  debug: false                              # Show an overlay over images indicating the pixel depth of the image when working with retina for example

media:
  enable_media_timestamp: false             # Enable media timestamps
  upload_limit: 0                           # Set maximum upload size in bytes (0 is unlimited)
  unsupported_inline_types: []              # Array of supported media types to try to display inline
  allowed_fallback_types: []                # Array of allowed media types of files found if accessed via Page route

session:
  enabled: true                             # Enable Session support
  timeout: 1800                             # Timeout in seconds
  name: grav-site                           # Name prefix of the session cookie. Use alphanumeric, dashes or underscores only. Do not use dots in the session name
  secure: false                             # Set session secure. If true, indicates that communication for this cookie must be over an encrypted transmission. Enable this only on sites that run exclusively on HTTPS
  httponly: true                            # Set session HTTP only. If true, indicates that cookies should be used only over HTTP, and JavaScript modification is not allowed.


I'm thinking about using a ComplexDataReader, but on the other hand the file is well structured, like JSON or XML.

Has anyone already tried this? Solutions or suggestions are welcome :)

Thanks
Stephan

slechtaj
Posts: 192
Joined: Wed Aug 15, 2012 8:18 am

Re: How to read a YAML file

Postby slechtaj » Tue Mar 08, 2016 6:26 pm

Hi Stephan,

If the YAML file always has this structure, you can use the Pivot component instead of configuring ComplexDataReader. The process might look like this:
  1. Read the rows of the YAML file one by one using UniversalDataReader.
  2. Trim the strings, since the file contains a lot of whitespace that you probably do not want to map into fields.
  3. Use the Pivot component to transpose the input records into fields of the output record's metadata (see the attached example).
Of course, if the YAML file were more deeply structured, a different approach might be needed, but for the structure shown this is sufficient.
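
For readers who cannot open the attached graph, the three steps above (read line by line, trim, pivot key/value rows into one record) can be approximated outside CloverETL. The Python sketch below is only an illustration of that idea, not the contents of readYaml.grf. The function names and the dotted-path naming for nested keys are assumptions made for this example. It handles only the simple subset of YAML shown in the question: values are kept as plain strings (inline lists such as [en, fr] are not expanded), and an apostrophe inside an unquoted value would confuse the comment stripping.

```python
def strip_comment(line):
    """Drop a trailing '# ...' comment, ignoring '#' inside quotes."""
    quote = None
    for i, ch in enumerate(line):
        if ch in ("'", '"'):
            if quote is None:
                quote = ch          # opening quote
            elif quote == ch:
                quote = None        # matching closing quote
        elif ch == "#" and quote is None and (i == 0 or line[i - 1].isspace()):
            return line[:i]
    return line


def unquote(text):
    """Remove one pair of matching surrounding quotes, if present."""
    if len(text) >= 2 and text[0] == text[-1] and text[0] in ("'", '"'):
        return text[1:-1]
    return text


def pivot_yaml_lines(lines):
    """Pivot 'key: value' lines into one flat record keyed by dotted path."""
    record = {}
    parents = []  # stack of (indent, key) for the enclosing sections
    for raw in lines:
        line = strip_comment(raw).rstrip()      # step 2: trim
        if not line.strip():
            continue                            # blank or comment-only line
        indent = len(line) - len(line.lstrip(" "))
        key, sep, value = line.strip().partition(":")
        if not sep:
            continue                            # not a key/value line
        # Leave every section that is not an ancestor of this line.
        while parents and parents[-1][0] >= indent:
            parents.pop()
        if parents:
            # The enclosing key turned out to be a section, not an empty value.
            record.pop(".".join(k for _, k in parents), None)
        key = unquote(key.strip())
        value = value.strip()
        path = ".".join([k for _, k in parents] + [key])
        record[path] = unquote(value)           # step 3: one field per key
        if not value:
            parents.append((indent, key))       # may be a section header
    return record


sample = """\
absolute_urls: false      # Absolute or relative URLs
param_sep: ':'            # use ';' for Apache on windows
proxy_url:                # manual proxy URL
home:
  alias: '/home'          # Default path for home
pages:
  order:
    by: default
  markdown:
    special_chars:
      '>': 'gt'
""".splitlines()                                # step 1: one row per line

record = pivot_yaml_lines(sample)
for path, value in record.items():
    print(path, "=", repr(value))
```

Keys with no value and no children (such as proxy_url) end up as empty strings, while section headers (such as home) are dropped once a child key is seen. In a graph, the dotted paths would correspond to the field names in the Pivot output metadata.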

Hope it helps.
Attachments
readYaml.grf
(3.35 KiB)
Jan Slechta
CloverCARE Support