9.2.3 Binary Protocol

--------------------------------------------------------------------------------
EVENTQL - EVENTQL BINARY PROTOCOL v1
--------------------------------------------------------------------------------
v0.2 - September, 2016                             Paul Asmuth 

Table of Contents

  1. Preface
  2. Overview
  3. Life of a Connection
  4. Notation
  5. Frame header
    5.1 opcode
    5.2 flags
    5.3 length
  6. Messages
    6.1 Client-to-Server (C2S) Messages
      6.1.1 HELLO
      6.1.2 PING
      6.1.3 HEARTBEAT
      6.1.4 ERROR
      6.1.5 READY
      6.1.6 BYE
      6.1.7 ACK
      6.1.8 QUERY
      6.1.9 QUERY_RESULT
      6.1.10 QUERY_CONTINUE
      6.1.11 QUERY_DISCARD
      6.1.12 QUERY_PROGRESS
      6.1.13 QUERY_NEXT
      6.1.14 INSERT
    6.2 Server-to-Server (S2S) Messages
      6.2.1 QUERY_PARTIALAGGR
      6.2.2 QUERY_PARTIALAGGR_RESULT
      6.2.3 QUERY_REMOTE
      6.2.4 QUERY_REMOTE_RESULT
      6.2.5 REPL_INSERT

1. Preface

The EventQL binary protocol took a lot of ideas from the MySQL and Cassandra
Binary Protocols.


2. Overview

  The EventQL binary protocol is a frame based protocol. Frames are defined as:

      0         8        16        24        32
      +---------+---------+-------------------+
      |      opcode       |      flags        |
      +---------+---------+-------------------+
      |                length                 |
      +---------+---------+---------+---------+
      |                                       |
      .           ...  payload ...            .
      .                                       .
      +----------------------------------------

  Each frame contains a fixed size header (8 bytes) followed by a variable size
  payload. The header is described in Section 5. The content of the payload
  depends on the header opcode value (the payload can in particular be empty for
  some opcode values). The list of allowed opcode is defined Section 5.4 and the
  details of each corresponding message is described Section 6.

  Note to client implementors: clients library should always assume that the
  payload of a given frame may contain more data than what is described in this
  document. It will however always be safe to ignore the remaining of the frame
  payload in such cases. The reason is that this may allow to sometimes extend
  the protocol with optional features without needing to change the protocol
  version.

  The EventQL protocol is generally speaking a simple request/response protocol.
  However, some requests will result in more than one response frame and some
  opcodes may be sent by the client even though another request has not completed
  yet (such as the KILL opcode). Still, the protocol does not allow "multiplexing"
  of multiple queries or independent operations onto a single connection at the
  same time.


3. Life of a Connection

  The lifetime of a connection consists of two phases: Initiation and Operation.
  A connection is always initiated by the client. To initiate a query, the client
  connects to the server and sends the initial HELLO frame. In the most simple
  case the server responds with a HELLO_ACK frame and the connection transitions
  to the operational phase. Depending on the auth method, more frames might
  be exchanged during the initiation process.

  Once the query is in operational state, the client may initiate a new
  request by sending one of these opcodes:

  After the request has completed, the client may reuse the connection to
  issue another request. At which point a request is considered complete depends
  on the specific opcode and is describe in the entry for the corresponding opcode
  in section 6.

  Note that some opcodes allow the client to send a subset of other opcodes while
  the request is still running.


4. Notation

  To describe the layout of the message payloads in the following sections,
  we define the following:

  [lenencint]      A Little Endian Base 128 encoded integer

  [lenencint*]     A list of [lenenint] values

  [lenencstr]      A length encoded string is a string that is prefixed with
                   a [lenenceint] describing the length of the string.

  [lenencstr*]     A list of [lenencstr] values

  [cstr]           A zero-byte-terminated string

  [cstr*]          A list of [cstr] values

  [zerobyte]       A single '\0' byte

  We also define this notation for optinal fields:

      | if 


5. Frame header

5.1. opcode

  A 16-bit big-endian integer that distinguishes the type of message:
    0x5e00   HELLO
    0x0001   PING
    0x0002   HEARTBEAT
    0x0003   ERROR
    0x0004   READY
    0x0005   BYE
    0x0006   QUERY
    0x0007   QUERY_RESULT
    0x0008   QUERY_CONTINUE
    0x0009   QUERY_DISCARD
    0x000a   QUERY_PROGRESS
    0x000b   QUERY_NEXT
    0x000f   ACK
    0x0010   INSERT
    0x0101   QUERY_PARTIALAGGR
    0x0102   QUERY_PARTIALAGGR_RESULT
    0x0103   QUERY_REMOTE
    0x0104   QUERY_REMOTE_RESULT
    0x0110   REPL_INSERT
    0x0200   META_PERFORMOP
    0x0201   META_PERFORMOP_RESULT
    0x0202   META_CREATEFILE
    0x0203   META_GETFILE
    0x0204   META_GETFILE_RESULT
    0x0205   META_DISCOVER
    0x0206   META_DISCOVER_RESULT
    0x0207   META_LISTPARTITIONS
    0x0208   META_LISTPARTITIONS_RESULT
    0x0209   META_FINDPARTITION
    0x020a   META_FINDPARTITION_RESULT

  Messages are described in Section 6.

5.2. flags

  Flags applying to this frame.

    0x01  EVQL_ENDOFREQUEST
          This is the last response frame for the currently running request


5.3. length

  A 32-bit big-endian integer representing the length of the payload of the
  frame (note: currently a frame is limited to 256MB in length).


6. Messages

6.1.2 HELLO

  FIXME

  Payload:
    lenencint      protocol_version
    lenencstr      eventql_version
    lenencint      flags
    lenencint      idle_timeout
    lenencint      authdata_len
    cstr*          authdata
    lenencstr      database              | if HELLO_SWITCHDB

  Fields:
    flags          0x01 HELLO_INTERNAL - This is an internal S2S connection
    flags          0x02 HELLO_SWITCHDB - Switch to a database on connect
    flags          0x02 HELLO_INTERACTIVEAUTH - Enable interactive auth

  Response:
    


6.1.2 PING

  On an operational connection, the ping message may be sent at _any_ time by
  either the client or the server, even while a request is running.

  The PING message may safely be ignored.

  Payload:
    

  Response:
    


6.1.4 ERROR

  FIXME

  Payload:
    lenencstr      error_string
    zerobyte

  Response:
    


6.1.5 READY

  FIXME

  Payload:
    lenencint      flags
    lenencint      idle_timeout_us

  Response:
    


6.1.8 QUERY

  The QUERY message initiates a new request and is only sent by the client.
  A QUERY message must not be sent while another request is still running.

  Payload:
    lenencstr      query
    lenencint      flags
    lenencint      max_rows
    lenencstr      database        | if flags 0x02 SWITCH_DATABASE is set

  Fields:
    max_rows       The maximum number of rows that should be returned in the
                   first response frame. A value of zero indicates that as many
                   rows as possible should be returned in the first response
                   frame

    flags          0x01 QUERY_SWITCHDB - Switch database
                   0x02 QUERY_MULTISTMT - Allow multiple result sets
                   0x04 QUERY_PROGRESS - Send progress frames
                   0x08 QUERY_NOSTATS - Omit stats in query result

    query          The zero-terminated query string


  Response:
    After sending a query message, the client should expect to receive one of
    these response messages:
      - ERROR
      - QUERY_PROGRESS
      - QUERY_RESULT

    The request is considered completed once any one of these conditions is
    satisfied:
      - The server has responseded with a ERROR message
      - The server has responded with a QUERY_RESULT message with the COMPLETE
        flag set

    After the request is complete, the server will not send any further messages.


6.1.9 QUERY_RESULT

  The QUERY_RESULT message contains a description of the result table as well
  as zero or more rows from the result tables. Results are always paged, i.e.
  depending on the result size the result table will be split into many
  QUERY_RESULT frames.

  After receiving a QUERY_RESULT frame, the client must check if the COMPLETED
  flag is set. If the flag is set, the query can be considered complete and the
  client should not expect any further frames from the server.

  If the flag is not set, the client must either discard the remainder of the
  result by sending a QUERY_DISCARD frame or request the next result frame by
  sending the QUERY_CONTINUE frame.

  In a series of QUERY_RESULT frames, only the first frame will contain the
  result table schema information.

  Payload:
    lenencint      flags
    lenencint      num_result_columns
    lenencint      num_result_rows
    lenencint      num_rows_modified   | if EVQL_QUERY_RESULT_HASSTATS
    lenencint      num_rows_scanned    | if EVQL_QUERY_RESULT_HASSTATS
    lenencint      num_bytes_scanned   | if EVQL_QUERY_RESULT_HASSTATS
    lenencint      query_runtime_ms    | if EVQL_QUERY_RESULT_HASSTATS
    lenencstr*     column_names        | if EVQL_QUERY_RESULT_HASCOLNAMES
    lenencstr*     data

  Fields:
    flags          0x01 QUERY_RESULT_COMPLETE - This is the last result frame
                   0x02 QUERY_RESULT_HASSTATS - This result frame includes stats
                   0x04 QUERY_RESULT_HASCOLNAMES - This frame includes colnames
                   0x08 QUERY_RESULT_PENDINGSTMT - This is NOT the last statement

  Response:
    If the COMPLETE flag is not set, the client must respond with
    QUERY_CONTINUE or QUERY_DISCARD.

    If the COMPLETE flag is set, but the PENDINGSTMT flag is also set, the
    client must respond with QUERY_NEXT or QUERY_DISCARD.

    Otherwise, the client must not send any response.


6.1.10 QUERY_CONTINUE

  The QUERY_CONTINUE message may be sent by the client in response to a
  QUERY_RESULT message that did not have the COMPLETED flag set.

  Payload:
    

  Response:
    After sending a QUERY_CONTINUE message, the client should expect to receive
    one of these response messages:
      - PING
      - ERROR
      - QUERY_RESULT


6.1.11 QUERY_DISCARD

  The QUERY_DISCARD message may be sent by the client in response to a
  QUERY_RESULT message that did not have the COMPLETED flag set.

  Payload:
    

  Response:
    After sending a QUERY_CONTINUE message, the client should expect to receive
    one of these response messages:
      - PING
      - ERROR
      - QUERY_RESULT


6.1.12 QUERY_PROGRESS

  The QUERY_PROGRESS message is an information message from the server to the
  client. It contains information about the progress of query execution and
  may be received after sending a QUERY frame and before the first QUERY_RESULT
  frame is received.

  The QUERY_PROGRESS message may safely be ignored.

  Payload:
    lenencint      num_rows_modified
    lenencint      num_rows_scanned
    lenencint      num_bytes_scanned
    lenencint      query_progress_permill
    lenencint      query_elapsed_ms
    lenencint      query_eta_ms

  Fields:
    FIXME

  Response:
    


6.1.14 INSERT

  The INSERT frame is sent from a client to a server to start an insert
  operation. The INSERT message initiates a new request. A INSERT message must
  not be sent while another request is still running.

  Payload:
    lenencint      flags
    lenencstr      database
    lenencstr      table
    lenencint      records_encoding
    lenenecstr     records_encoding_info | if EVQL_INSERT_HAS_ENCODING_INFO
    lenencint      records_count
    lenenecstr     *records

  Fields:
    flags                  Currently unused
    database               The database in which the records should be inserted
    table                  The table in which the records should be inserted
    records_count          The number of records in this INSERT frame
    records                The list of records
    records_encoding       The record encoding:
                             0x01  JSON
                             0x02  CSV
    records_encoding_info  Set to the csv header if the encoding is CSV

  Flags:
    0x01  EVQL_INSERT_HAS_ENCODING_INFO
          Set if the records_encoding_info field is set

  Response:
    After sending an INSERT message, the client should expect to receive one of
    these response messages:
      - ERROR
      - HEARTBEAT
      - ACK

    The request is considered completed once the client has received a response
    frame with the EVQL_ENDOFREQUEST flag set.  After the request is complete,
    the server will not send any further messages.


6.2.1 QUERY_PARTIALAGGR

  The QUERY_PARTIALAGGR frame is used to execute a partial aggregation on a
  remote server.

  The QUERY_PARTIALAGGR message initiates a new request. A QUERY_PARTIALAGGR
  message must not be sent while another request is still running.

  Payload:
    lenencint      flags
    lenencstr      database
    lenencstr      encoded_qtree

  Fields:
    flags          Currently unused
    datbase        The database on which the query should be executed
    encoded_qtree  The encoded query tree

  Response:
    After sending a QUERY_PARTIALAGGR message, the client should expect to
    receive one of these response messages:
      - ERROR
      - HEARTBEAT
      - QUERY_PARTIALAGGR_RESULT

    The request is considered completed once the client has received a response
    frame with the EVQL_ENDOFREQUEST flag set.  After the request is complete,
    the server will not send any further messages.


6.2.2 QUERY_PARTIALAGGR_RESULT

  The QUERY_PARTIALAGGR_RESULT message contains a the result of a successful
  QUERY_PARTIALAGGR call. Results may be paged, i.e. depending on the result size
  the result will be split into many QUERY_PARTIALAGGR_RESULT frames.

  After receiving a QUERY_PARTIALAGGR_RESULT frame, the client must check if the
  EVQL_ENDOFREQUEST flag is set. If the flag is set, the request can be
  considered complete and the client should not expect any further frames from
  the server. If the flag is not set, the client must wait for the next
  QUERY_PARTIALAGGR_RESULT frame.

  Payload:
    lenencint      flags
    lenencint      num_rows
    lenencstr*     body

  Fields:
    flags          Currently unused
    num_rows       The number of result rows in this frame
    body           A list of group, scratch pairs (num_rows * 2 strings)

  Response:
    No response.


6.2.3 QUERY_REMOTE

  FIXME

  Payload:
    lenencint      flags
    lenencstr      database
    lenencstr      encoded_qtree

  Fields:
    flags          Currently unused
    datbase        The database on which the query should be executed
    encoded_qtree  The encoded query tree

  Response:
    After sending a QUERY_PARTIALAGGR message, the client should expect to
    receive one of these response messages:
      - ERROR
      - HEARTBEAT
      - QUERY_REMOTE_RESULT

    The request is considered completed once the client has received a response
    frame with the EVQL_ENDOFREQUEST flag set.  After the request is complete,
    the server will not send any further messages.


6.2.4 QUERY_REMOTE_RESULT

  The QUERY_REMOTE_RESULT message contains a the result of a successful
  QUERY_REMOTE call. Results may be paged, i.e. depending on the result size
  the result will be split into many QUERY_REMOTE_RESULT frames.

  After receiving a QUERY_REMOTE_RESULT frame, the client must check if the
  EVQL_ENDOFREQUEST flag is set. If the flag is set, the request can be
  considered complete and the client should not expect any further frames from
  the server. If the flag is not set, the client must wait for the next
  QUERY_REMOTE_RESULT frame.

  Payload:
    lenencint      flags
    lenencint      column_count
    lenencint      row_count
    lenencstr      row_data

  Fields:
    flags          Currently unused
    num_rows       The number of result rows in this frame

  Response:
    No response.


6.2.5 REPL_INSERT

  FIXME

  Payload:
    lenencint      flags
    lenencstr      database
    lenencstr      table
    lenencstr      partition_id
    lenencstr      body

  Fields:
    flags          Currently unused
    database       Database for insert
    table          Table for insert
    partition_id   Partition for insert
    body           Encoded shredded record list

  Response:
    After sending a REPL_INSERT message, the client should expect to receive
    one of these response messages:
      - ERROR
      - HEARTBEAT
      - ACK

    The request is considered completed once the client has received a response
    frame with the EVQL_ENDOFREQUEST flag set.  After the request is complete,
    the server will not send any further messages.