QueryParamsProcessor

Reusable; allows any controller to handle queryParams (e.g. :index action)

Notes
  • Purpose: Provides consistent and centralized handling of common query parameters for API endpoints. Designed to process parameters for filtering, sorting, limiting, pagination, and more.


  • Filtering:

    • Can specify which columns to include via the fields parameter.

    • Alternatively, columns to exclude can be specified using the exclude parameter.

    • Default Behavior: If neither fields nor exclude is provided, all columns will be selected and returned.

  • Sorting:

    • The sort parameter defines the column by which the data should be sorted.

    • The order parameter defines the direction of the sort (e.g., ascending, descending, newest, oldest).

    • Default behavior: If neither sort nor order is specified, the default sort field is created_at and the default order is desc.

  • Limit & Pagination:

    • The limit parameter allows the user to restrict the number of records returned.

    • Pagination is achieved through either the page or offset parameter.

      • Note: The page method requires a pagination gem (e.g., kaminari, will_paginate).

    • Default Behavior: If neither limit nor page/offset is specified, all records will be returned without pagination.

      • Note: This could result in large datasets being fetched.

  • Additional Query Params:

    • While they're commented out, there are placeholders for potential future implementations of other common query parameters like where, group, having, distinct, include, joins, and select.

    • Security considerations are mentioned for potential pitfalls with some of these methods (e.g., direct use of user input without sanitization).

  • Private Helper Methods:

    • determine_sort_field: Ensures a valid column is chosen for sorting, defaulting to created_at.

    • column_exists?: Checks if a column exists in the model being queried.

    • determine_direction: Deciphers the desired sort direction based on the provided order parameter.


  • Error Handling & Security:

    • There's a noted need for adding error handling to ensure the robustness of the system.

    • It's emphasized that direct user input should be sanitized and validated to avoid potential security risks, such as SQL injection attacks.

  • Performance Considerations:

    • In operations that could return large datasets (e.g., broad selects without limits), there's a cautionary note about potential performance issues, suggesting regular monitoring and the use of tools like rack-mini-profiler for performance checking.

More Notes (using query params for GET requests)
  1. Conventional RESTful Semantics: According to RESTful principles, GET requests should be idempotent, meaning they don't cause side-effects or modify the state. Using query parameters to filter, sort, limit, and paginate on :index and :show actions aligns well with this idea, as you're essentially specifying how to "view" or "retrieve" the data.

  2. Enhances User Experience and Developer Efficiency:

    • Flexibility for Clients: This pattern gives frontend clients (or other services) the flexibility to retrieve data in the way they need without requiring changes on the backend each time.

    • Reduced Endpoints: Instead of creating multiple specialized endpoints for different views of the data, you can adjust the view via query parameters.

  3. Performance Implications:

    • For :show actions, where you're retrieving a single record, the primary utility of query parameters is to filter fields, as sorting, limiting, and paginating don't typically apply. Being selective about the fields returned can improve performance, especially if the model has many attributes or if some attributes are computationally expensive to generate.

    • For :index actions, all the functionalities (filtering, sorting, limiting, and paginating) play a crucial role. Limiting and paginating are particularly important for performance reasons. Without them, you might end up fetching a huge amount of data which could slow down the system.

  4. Security and Data Integrity:

    • Always be cautious about exposing fields, especially sensitive ones. Even if a field is not typically shown in the UI, a savvy user could manually add it to the fields query parameter. Ensure sensitive fields are either not accessible or are explicitly whitelisted/blacklisted.

    • Validating the query parameters as you've done, especially for sorting and field selection, helps mitigate risks of SQL injection.

  5. Maintainability:

    • Using a module (QueryParamsProcessor) to handle this functionality helps maintain separation of concerns, keeping the controllers lean.

    • This pattern is extensible. If in the future you decide to support more advanced querying capabilities, having a dedicated module will make it easier to expand upon.

  6. Documentation and Training:

    • Ensure that this feature is well-documented so frontend developers know how to use it effectively.

    • Since this pattern isn't a Rails default, it's essential that any new team members are made aware of it and trained on how to use and extend it.

  7. Consideration for Other Actions:

    • For actions like :create, :update, or :delete, the primary data interchange should be in the request body, not the query parameters. The semantics of query parameters (optional, not affecting state) are a better fit for read-only operations.

Details about Filtering, Sorting, Limiting, Pagination

FILTER (fields, exclude)

  • Restrictions: whitelisting/blacklisting keys (per model)

    • Array of keys

    • Model file

    • Dynamically whitelist/blacklist based on user / role (e.g. admin vs other)

  • Concerns:

    • Users get access to data they shouldn’t be able to

SORT (sort, order)

  • Restrictions: …minimal

    • Internal business logic or system behavior

      • sorting by created_at/updated_at = could give insights into internal activity or update patterns)

    • Expensive operations

      • Sorting by certain fields could be expensive

  • Concerns:

    • Sorting can potentially be an expensive operation, especially on large datasets

      • index fields that allow sorting

    • SQL attacks

      • Always ensure that sorting parameters are not directly interpolated into SQL queries.

      • Using ActiveRecord's built-in query methods as you've done is a good approach because it provides a level of protection against SQL injection attacks.

LIMIT (limit)

  • Restrictions: max limit

    • (without, a user could potentially request vast number of records, leading to resource exhaustion)

    • **if user specifies higher than limit, just round down to limit

    • **also consider a default limit

  • Concerns:

    • DoS attacks (Denial of Service — via continuously making requests that pull large amounts of data)

PAGINATION (offset/page)

  • Restrictions: max page size

    • **if user specifies higher than limit, just round down to limit

    • **also consider a default page size

    • **offset-based vs cursor-based

  • Concerns:

    • Pagination can potentially be an expensive operation, especially on large datasets

      • index fields that allow pagination


Error handling

  • "Consider adding error handling. For instance, if someone passes a non-integer for limit, it might cause an error. Validating user input before using it would be beneficial."

Avoiding SQL injection attacks

  • Sanitize (user input) & validate

    • **use ActiveRecord's built-in methods to prevent injection

    • (e.g. sort method = “uses AR’s built-in query methods)

Implement in controller (items_controller.rb):

class ItemsController < ApplicationController
  include QueryParamsProcessor

  def index
  
    ...
    # Process query params
    @items = process_query_params(@items)
    ...
    
    render json: @items
  end
end

Logic:

# app/controllers/concerns/query_params_processor.rb
module QueryParamsProcessor
  extend ActiveSupport::Concern

  # NOTES:
  #   - Error Handling: Consider adding error handling. For instance, if someone passes a non-integer for limit, it might cause an error. Validating user input before using it would be beneficial.
  #   - Performance: For operations that could return a large dataset (like a broad select without limits), be aware of potential performance issues. Always monitor and check query performance, especially when the dataset grows. Tools like rack-mini-profiler can help with this.
  #   - Security: The comments below indicate the right level of caution ("IMPORTANT"). Especially, always be wary of using direct user input in queries. Always sanitize and validate. For the methods you've commented out (like WHERE and JOIN), you're absolutely correct about the risks of SQL injection. Never pass user input directly into such methods. Always use ActiveRecord's built-in methods to prevent injection.

  ALLOWED_SORT_DIRECTIONS = ['asc', 'desc', 'newest', 'oldest']

  def process_query_params(resource)
    # FILTER
    # (e.g. api.get('.../users?fields=id,name,email')) (inclusive)
    # (e.g. api.get('.../users?exclude=id,name,email')) (exclusive)
    # (e.g. SELECT id, name, email FROM users)
    # Used for selecting specific fields
    # Common keys (inclusive): 'filter', 'fields', 'columns'
    # Common keys (exclusive): 'exclude'
    # Keys used: 'fields', 'exclude'
    resource = filter_fields(resource)

    # SORT
    # (e.g. api.get('.../users?sort=role'))
    # (e.g. SELECT * FROM users ORDER BY role)
    # Used for sorting
    # Common keys: 'sort', 'order', 'order_by', 'order_dir', 'sort_by', 'sort_dir', 'sort_field'
    # Keys used: 'sort' (field), 'order' (direction)
    resource = sort_resource(resource)

    # LIMIT
    # (e.g. api.get('.../users?limit=10'))
    # (e.g. SELECT * FROM users LIMIT 10)
    # Used for pagination
    # Common keys: 'limit', 'per_page'
    resource = resource.limit(params[:limit].to_i) if params[:limit].present?

    # PAGE
    # (e.g. api.get('.../users?page=2'))
    # (e.g. SELECT * FROM users LIMIT 10 OFFSET 10)
    # Used for pagination
    # Common keys: 'page', 'offset'
    # resource = resource.page(params[:page].to_i) if params[:page].present? # '.page()' = requires gem (e.g. kaminari, will_paginate) (1-based)
    # resource = resource.offset(params[:page].to_i) if params[:page].present? # '.offset()' = native to Rails (0-based) (using 'page' key) ('page' is generally preferred...  but probably don't want to use 'page' for key and also be 0-based)
    resource = resource.offset(params[:offset].to_i) if params[:offset].present? # '.offset()' = native to Rails (0-based) (using 'offset' key)

    
    
    ## ADDITIONAL COMMON QUERY PARAMS (consider in future)
    ##################################
    
    # WHERE
    # (e.g. api.get('.../users?where=role=manager'))
    # (e.g. SELECT * FROM users WHERE role = 'manager')
    # Used for filtering
    # Common keys: 'where', 'filter'
    # IMPORTANT: this is not a very secure way to filter (e.g. SQL injection attacks) (always sanitize user input) (this may go for GROUP, HAVING, DISTINCT, etc. as well)
    # resource = resource.where(params[:where]) if params[:where].present?


    # GROUP
    # (e.g. api.get('.../users?group=role'))
    # (e.g. SELECT role, count(*) FROM users GROUP BY role)
    # Used for aggregations (e.g. count, sum, average, etc.)
    # Common keys: 'group', 'group_by'
    # resource = resource.group(params[:group]) if params[:group].present?

    # HAVING
    # (e.g. api.get('.../users?having=role=manager'))
    # (e.g. SELECT role, count(*) FROM users GROUP BY role HAVING role = 'manager')
    # Used for aggregations (e.g. count, sum, average, etc.)
    # Common keys: 'having'
    # resource = resource.having(params[:having]) if params[:having].present?

    # DISTINCT
    # (e.g. api.get('.../users?distinct=true'))
    # (e.g. SELECT DISTINCT role FROM users)
    # Used for aggregations (e.g. count, sum, average, etc.)
    # Common keys: 'distinct', 'uniq'
    # resource = resource.distinct if params[:distinct].present?

    # INCLUDE
    # (e.g. api.get('.../users?include=projects'))
    # (e.g. SELECT * FROM users LEFT OUTER JOIN projects ON projects.user_id = users.id)
    # Used for eager loading (e.g. to avoid N+1 queries) (https://guides.rubyonrails.org/active_record_querying.html#eager-loading-associations)
    # Common keys: 'include', 'includes'
    # resource = resource.includes(params[:include]) if params[:include].present?

    # JOINS
    # (e.g. api.get('.../users?joins=projects'))
    # (e.g. SELECT * FROM users INNER JOIN projects ON projects.user_id = users.id)
    # Used for eager loading (e.g. to avoid N+1 queries)
    # Common keys: 'joins', 'join'
    # resource = resource.joins(params[:joins]) if params[:joins].present?

    # SELECT
    # (e.g. api.get('.../users?select=id,name,email'))
    # (e.g. SELECT id, name, email FROM users)
    # Used for selecting specific fields
    # Common keys: 'fields', 'columns'
    # resource = resource.select(params[:select]) if params[:select].present?

    resource
  end

  private



  ## MAIN METHODS
  def filter_fields(resource)
    if params[:fields].present? && params[:fields] != '*'# Example: api.get('.../users?fields=id,name,email')
      selected_fields = params[:fields].split(',') # ['id', 'name', 'email']
      # Ensure only valid columns are selected
      selected_fields = selected_fields.select { |field| sort_field_exists?(field) }
      resource = resource.select(selected_fields) unless selected_fields.empty?  # SELECT id, name, email FROM users
    elsif params[:exclude].present? # Example: api.get('.../users?exclude=id,name,email')
      all_fields = resource.klass.column_names
      exclude_fields = params[:exclude].split(',')
      selected_fields = all_fields - exclude_fields
      resource = resource.select(selected_fields) unless selected_fields.empty?  # Example: SELECT created_at, updated_at FROM users
    end
    resource
  end


  def sort_resource(resource)
    sort_field = params[:sort].present? ? determine_sort_field(params[:sort], resource) : nil
    direction = params[:order].present? ? determine_direction(params[:order]) : nil

    # Default behavior if neither is present
    return resource unless sort_field || direction
    
    # Default to 'created_at' if only direction is specified
    sort_field ||= 'created_at'
  
    # Default to 'desc' if only sort_field is specified
    direction ||= 'desc'

    resource.order(sort_field => direction)
  end


  ## HELPER METHODS
  def determine_sort_field(sort, resource)
    if column_exists?(sort, resource)
      sort
    else
      'created_at'
    end
  end


  def column_exists?(field, resource)
    resource.klass.column_names.include?(field)
  end

  def determine_direction(order)
    return 'desc' unless ALLOWED_SORT_DIRECTIONS.include?(order) # default is 'desc' (newest)
    
    case order
      when 'asc', 'desc'
        order
      when 'newest'
        'desc'
      when 'oldest'
        'asc'
      else
        'desc'
    end
  end
end

Last updated