Serialise, Compress, Encrypt, and Transfer Data
Source: R/serialise-encrypt-post.R
serialise-encrypt-post.Rd
Securely serialise, compress, encrypt, and transfer any R object with full attribute preservation and cross-platform JSON compatibility.
Usage
serialise(object, compress = TRUE, as_character = TRUE)
deserialise(object, decompress = TRUE)
encrypt(x, key = Sys.getenv("mmbi_epi_encryption_key"), as_character = TRUE)
decrypt(x, key = Sys.getenv("mmbi_epi_encryption_key"), as_character = TRUE)
post_data(
  object,
  url,
  authorization_header = NULL,
  compress = TRUE,
  encrypt = TRUE,
  key = Sys.getenv("mmbi_epi_encryption_key")
)
create_json_body(
  object,
  compress = TRUE,
  encrypt = TRUE,
  key = Sys.getenv("mmbi_epi_encryption_key")
)
read_json_body(
  object,
  decompress = TRUE,
  decrypt = NULL,
  key = Sys.getenv("mmbi_epi_encryption_key")
)
Arguments
- object
Any object of any size, preferably a data set
- compress, decompress
Should the serialised object be compressed/decompressed? At least allowed: "gzip" (or TRUE), "bzip2", "xz", see base::memCompress(). Use FALSE to not compress/decompress.
- as_character
A logical to indicate whether output should be converted to a character string. Note that these have a limit of 2,147,483,647 characters (= \(2^{31} - 1\) bytes = ~2 GB in object size), so a raw vector should be used for very large inputs (i.e., as_character = FALSE).
- x
- key
A character to be used as the encryption key. Internally, this is converted using openssl::sha256() to ensure a raw high-entropy key of length 32, suitable for AES-GCM encryption. The default is the system environment variable: mmbi_epi_encryption_key.
- url
A character string specifying the target URL for the HTTP POST request. Must include the full scheme (e.g., "https://" or "http://"), hostname, and path.
- authorization_header
A character string specifying the value of the Authorization header to include in the POST request, e.g. "Bearer <token>". Use NULL to omit the header.
- encrypt, decrypt
Should the serialised object be encrypted/decrypted? This applies AES-GCM via openssl::aes_gcm_encrypt(), providing authenticated encryption. This guarantees both confidentiality and integrity: the file cannot be read without the correct key, and any tampering will be detected automatically during decryption. The initialization vector (iv) will be a length-12 random raw vector.
Details
Serialisation
serialise() converts an arbitrary R object into a transportable format by wrapping it with metadata, including:
- Object-level attributes (via attributes()),
- For data frames: per-column attributes, including class (e.g., factor, Date, POSIXct), levels, and time zone information.
The wrapped structure is then converted to JSON using jsonlite::toJSON(), with consistent handling of NULLs, NAs, and timestamps. This structure allows accurate reconstruction of the original object, including attributes, when passed through deserialise().
The resulting JSON representation is portable and can be decoded in non-R environments such as Python. This method avoids base R serialize(), whose output is R-specific and unreadable elsewhere.
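The idea of wrapping data together with its metadata can be illustrated with a minimal Python sketch. The envelope layout and the field names used here ("attributes", "column_attributes", "data") are hypothetical and chosen for illustration only; they are not the wire format produced by serialise().

```python
import json

def wrap(columns, column_attributes, object_attributes):
    """Bundle data with its metadata in one JSON string (hypothetical layout)."""
    envelope = {
        "attributes": object_attributes,
        "column_attributes": column_attributes,
        "data": columns,
    }
    return json.dumps(envelope)

def unwrap(text):
    """Parse the envelope and re-apply per-column metadata."""
    envelope = json.loads(text)
    data = envelope["data"]
    for name, attrs in envelope["column_attributes"].items():
        if attrs.get("class") == "factor":
            levels = attrs["levels"]
            # factor codes are 1-based in R; map them back to their labels
            data[name] = [
                None if code is None else levels[code - 1]
                for code in data[name]
            ]
    return data, envelope["attributes"]

text = wrap(
    columns={"Sepal.Length": [5.1, 4.9, None], "Species": [1, 1, 3]},
    column_attributes={
        "Species": {
            "class": "factor",
            "levels": ["setosa", "versicolor", "virginica"],
        }
    },
    object_attributes={"class": "data.frame"},
)
data, attrs = unwrap(text)
print(data["Species"])
```

Because the metadata travels inside the same JSON document, a receiver in any language can rebuild typed columns without knowing anything about R's internal object representation.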
Compression
If compress = TRUE, serialise() uses gzip compression (memCompress(type = "gzip")) by default. Other algorithms ("bzip2", "xz") are supported. Compression reduces payload size but requires the same algorithm to be used for decompression. In deserialise() and read_json_body(), the corresponding memDecompress() step is applied when decompress = TRUE.
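For receivers outside R, the framing of the compressed bytes matters. The serialise(iris) output in the Examples starts with the bytes 78 9c, which is a zlib stream header rather than a gzip file header, so a decoder that only accepts gzip files may reject the payload. The sketch below uses only the Python standard library; decompress_any() is a hypothetical helper name, and the sample payload is made up.

```python
import gzip
import zlib

# made-up payload standing in for a serialised JSON string
payload = b'{"Species": ["setosa", "setosa", "virginica"]}' * 50

# zlib.compress() writes a zlib-framed deflate stream
compressed = zlib.compress(payload)
print(compressed[:2].hex())

def decompress_any(blob: bytes) -> bytes:
    """Decompress a deflate stream with either zlib or gzip framing."""
    # MAX_WBITS | 32 tells zlib to detect the header type automatically
    return zlib.decompress(blob, wbits=zlib.MAX_WBITS | 32)

restored_from_zlib = decompress_any(compressed)
restored_from_gzip = decompress_any(gzip.compress(payload))
```

Accepting both framings keeps the receiver working regardless of which header the sender writes; "bzip2" and "xz" payloads would need the bz2 and lzma modules instead.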
Encryption (AES-GCM)
encrypt() applies AES in Galois/Counter Mode (GCM) via openssl::aes_gcm_encrypt(). AES-GCM provides authenticated encryption: it guarantees confidentiality (content is unreadable without the key) and integrity (any bit-level modification is detected during decryption). A fresh 12-byte initialisation vector (IV) is generated for each encryption (rand_bytes(12)), which is required for security. Because the IV is random/unique per call, the ciphertext differs across runs even for identical inputs; this is expected and desirable. The IV itself is not secret and is packaged alongside the ciphertext so decryption can succeed.
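The two inputs to AES-GCM described here, a 32-byte key and a fresh 12-byte IV, can be reproduced on a non-R receiver or sender with the Python standard library. This is a sketch only: derive_key() and new_iv() are hypothetical helper names, and it assumes the key string is hashed as its UTF-8 bytes, which should be checked against the R side.

```python
import hashlib
import secrets

def derive_key(passphrase: str) -> bytes:
    """SHA-256 digest of the passphrase: always 32 bytes (AES-256 key size)."""
    # assumption: the passphrase is hashed as UTF-8 bytes
    return hashlib.sha256(passphrase.encode("utf-8")).digest()

def new_iv() -> bytes:
    """Fresh random 12-byte IV; never reuse one with the same key."""
    return secrets.token_bytes(12)

key = derive_key("my-key")
iv_a, iv_b = new_iv(), new_iv()

print(len(key), len(iv_a))
print(iv_a != iv_b)  # two calls give different IVs, hence different ciphertexts
```

Hashing makes the key length independent of the passphrase length, but it does not add entropy: a short or guessable passphrase remains short or guessable after hashing.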
Transport
post_data() sends the JSON body with httr::POST() using encode = "json" and sets the HTTP Authorization header if you pass one (for example a bearer token). The receiving service can be any stack that can: (1) parse JSON, (2) base64-decode fields, (3) perform AES-GCM decryption with the same key and IV, (4) gzip decompress, and (5) deserialise JSON strings.
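Steps (1) and (2) can be sketched with the Python standard library. The body below is made up for illustration: the "data" bytes are placeholders, and only the "iv" value is copied from the create_json_body() output shown in the Examples. Note that the base64 text in that output contains line breaks (\n); Python's base64.b64decode() ignores them by default.

```python
import base64
import json

# hypothetical body: placeholder "data", "iv" taken from the Examples output
json_payload = json.dumps({
    # encodebytes() inserts a newline every 76 characters, like the R output
    "data": base64.encodebytes(b"\x00" * 100).decode("ascii"),
    "iv": "UyNCfTvh49Hbknrh",
})

# (1) parse JSON
payload = json.loads(json_payload)

# (2) base64-decode the fields; embedded newlines are discarded
ciphertext = base64.b64decode(payload["data"])
iv = base64.b64decode(payload["iv"])

print(len(ciphertext), len(iv))
```

A decoder that runs in strict mode (for example base64.b64decode(..., validate=True)) would reject the embedded newlines, so they would have to be stripped first.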
Read in R
To decrypt, decompress, and process in R at the receiving side, do:
library(mmbi.epi)
# assuming `json_payload` is received
read_json_body(json_payload, decompress = TRUE, decrypt = TRUE, key = "my-key")
Read in Python
To decrypt, decompress, and process in Python at the receiving side, do:
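import base64, io, json, zlib
import pandas as pd
from Crypto.Cipher import AES
from Crypto.Hash import SHA256
# assuming `json_payload` is received
payload = json.loads(json_payload)
blob = base64.b64decode(payload["data"])
iv = base64.b64decode(payload["iv"])
# key derivation (same as openssl::sha256 in R)
key = SHA256.new(b"my-key").digest()
# the R openssl package stores the 16-byte GCM tag at the end of the ciphertext
ct, tag = blob[:-16], blob[-16:]
# decrypt (AES-GCM) and verify the tag; tampering raises ValueError
cipher = AES.new(key, AES.MODE_GCM, nonce=iv)
decrypted = cipher.decrypt_and_verify(ct, tag)
# decompress: memCompress(type = "gzip") writes a zlib stream, not a gzip file;
# wbits = MAX_WBITS | 32 accepts either framing
decompressed = zlib.decompress(decrypted, wbits=zlib.MAX_WBITS | 32)
# parse
df = pd.read_json(io.StringIO(decompressed.decode("utf-8")))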
Examples
# SERIALISATION AND ENCRYPTION -----------------------------------------
# in essence:
iris2 <- iris |> serialise() |> deserialise()
identical(iris, iris2)
#> [1] TRUE
# and:
iris3 <- iris |> serialise() |> encrypt() |> decrypt() |> deserialise()
identical(iris, iris3)
#> [1] TRUE
# a serialised object is a representation for any type of data
serialise(iris)[1:25]
#> [1] "78" "9c" "b5" "5a" "4d" "8f" "5b" "37" "0c" "fc" "2f" "ef" "3c" "30" "1e"
#> [16] "45" "52" "1f" "fe" "0d" "3d" "14" "c8" "a1" "87"
# and can be converted back at any time
iris |> serialise() |> deserialise() |> head()
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1 5.1 3.5 1.4 0.2 setosa
#> 2 4.9 3.0 1.4 0.2 setosa
#> 3 4.7 3.2 1.3 0.2 setosa
#> 4 4.6 3.1 1.5 0.2 setosa
#> 5 5.0 3.6 1.4 0.2 setosa
#> 6 5.4 3.9 1.7 0.4 setosa
# POSTING DATA ---------------------------------------------------------
# post_data() sends data using POST, after serialising (and encrypting)
if (FALSE) { # \dontrun{
  post_data(iris,
            url = "https://some-server:8000/post",
            compress = TRUE,
            encrypt = TRUE)
} # }
# use create_json_body() to make an encrypted JSON of an object, and
# read_json_body() to read it back
iris_json <- iris |> create_json_body(compress = TRUE, encrypt = TRUE)
# (can be sent securely to a server)
# then:
iris4 <- iris_json |> read_json_body(decompress = TRUE, decrypt = TRUE)
identical(iris, iris4)
#> [1] TRUE
# equivalent using curl:
# curl -X POST https://some-server:8000/post \
#   -H "Content-Type: application/json" \
#   -d '...'
# replace the "..." with the outcome of create_json_body():
iris_json
#> {"data":"gfc07MBwBP+nN9v5SGSclbSHisUHnLaTY+Sr7FuG40U98n4W3ITjTwahkRnIdGHhhSq00rKV\nPoCkq/wsipXdQyW+dKW+qNwOQnBtM0175+J8YuSdfQ1yQstR3iIvkBNQ8jVHU9AX1LsIQWwS\nv/zE4HwLgZVf6iC3BqRKQtT6VBuCOk/m0iMB1CKEKGAyLTFsiHqcFxv5HPWGxGoXthI4Z2xV\nbiGhe3M4bV+zw53k1LpaW7Pf/PSwbZzPbCG6kvX7TOuEeTapop921aeMAnve3MbSEMceU9Hk\nmHm+fHmOpenyS6N6Dqh0hncktuYyQXc3LKdefgceYiJc8LQ+0E4ngji3I7MzOFlb3jI4+kpj\nHCDsCywHH3A4rRkwlJXj8+V+quqCJx8jyIYlyiZ8VTNprWzEsZRJDMHC8BprguEfjVABAWxC\nPFwTC+Z9RcxJGvzleoLv6Df2gvOA6d4HToCBnWdfXTKUOjrurkQ6bHI0efm198ENto9hJDFu\ndTgcSMXNiNJjgfyl77TaCBGd/WXF+ws52jqrvSF4scvydosNwYgOOgiEMQDuv5xmWhMTsO8L\nx1vsvHANQsFZRdB9kq5O40tlvTDATWRA1qJlztF33oQsogPg5MZTUTzuetDyEbmPF/7Eudwe\navo+968mGj8uXMTJRDJSS/55bNbmJmYnQmWSVnKUV7IJd9H/Zuo05YhN70hVqKmbpr6Q3TX0\nVgQTUzM9zs8LAkkFr0e3iiRm3JsKLmta688c9a5iLcfD4s4xxq3a4V9Hx6+APSt4TUVfNddC\nb85T+ZLjbMXOKvy3/XMvooct9I+M0TIT6MNykU3NasYeoQ/8BUE6FhllaaMfCbaMWaJ+8BkY\nXOeiWdXK7QDRva63iLD0VDPRDgtrG7ne/Dft54U/KOPUkWnXm3tB4fAA2U31BhISwCeHpCH4\nAkMkXNH7YerhA1YTOxXGblDGdSIjUeJZN1l7Qkiiib4xmOUpFIA6S9T1RAv8AHxuMitNLWbb\nPssKZVoem9nD5jf102xNHjNk4IgpaCDlchfn9p0fZ0WBaCIBDT1fhk/3c68jVlUSbauML3hF\noF7/UF0/UhB0n7t4DFaHDrAmQR1m1c4AFw70F5Fyxu/gXMd9dp7SvSS5VoEftk/Ms0r99WNV\nr36IugwwuGgF7Z8Nj6IIKp88Mscjst7Lp6046NHR0GXqaMeaZ7MeUJHyjwa6vRXhOjFv63SO\n/HcJdyPfQ+urLXJvNkeBlpVyi96bFOhj18NArV2A8O5X2VikNH+l81Bhm26Y00mohKfOjNqI\nHddCt5dufuDv09hn5dIZ0On/BXGJGR3TWyuUavBK+S/4NsjVSIyo4f3K6buZ3y9bDtVFZEtI\nMMlqjq/76gdiBBzr6yaztO/b8a+/Fai2gubpqGH//m/J8VSo3aisugnD1Z3K/iCB/6pvhEwH\nQzFf2PO7redGobYdSIckH8/xvptaV/Wc8tHfl7+UE2NQDaj7IPV7vtKwd/X3FHmjiDnTGLhi\nD0Db5vOm6ZhPxkRR3EzLLKl1roOeNKVJeyyoU5kML2yiQKde/hS+doRyV7j+eqL3lQGlCJRW\nbAAii2wvQouw2Fw9EzSUBegEtg3kYzZ4t0qZfpNHS0bjQhgauZXuBs6/BElV9rjbBwTcoBJ1\nSjY2Yv/Z3zbBsOdEMEgNuZ2bPhByJ7CFLbg02V0Dv+U26yy164/JadFkafAFAloR214V3WEI\nIPr+g2X67sODA/9I0ZtwPndHntLBRh13yBcieQkVy1qqasZfwTasg40SqzZ1jUHMfw6YmXx6\nVSNtoCJkarAsAo20kPW+n/Q853rLBPgsrRwUQXZEw8wYF0Gytzpm/PA3tTXP6f8aa/WLD3z1\nVe1/y0/J/U+/YDpsMDh6HGQeEuoV6X+RIrlGNBaSNFnpQwjKTsJeMk93ouI9i6CR
LKntJ7Hm\nP0ZNiFSyfcbcuVo5+npk2UI+rQbivoO32LdUyopTx1j9","iv":"UyNCfTvh49Hbknrh"}