|
| 1 | +# HAT File Storage |
| 2 | + |
| 3 | +Files are a rather different beast from structured (meta)data about your personal digital life: |
| 4 | + |
| 5 | +- they are big to store |
| 6 | +- sending them back and forth requires a lot of bandwidth |
| 7 | +- databases that work well for structured data are not well suited to storing files |
| 8 | +- data in a file cannot normally be sliced and diced the way structured metadata can |
| 9 | +- files would often be uploaded from a low-bandwidth (e.g. mobile) device that is likely to have intermittent connectivity |
| 10 | + |
| 11 | +These considerations interact with the requirement that file storage be reliable, secure and cost-efficient, especially if we want to keep HATs affordable. We also still want to be able to attach metadata to files and maintain fine-grained access control. |
| 12 | + |
| 13 | +APIs also differ from web pages in how file uploads tend to be, or can be, handled. Without going into much detail here, there are plenty of reasons why [uploading using multipart forms mostly sucks](https://philsturgeon.uk/api/2016/01/04/http-rest-api-file-uploads/), as well as pretty good examples of how file uploads could be done, e.g. [YouTube Resumable Uploads](https://developers.google.com/youtube/v3/guides/using_resumable_upload_protocol). As a building block to satisfy all of these requirements we picked AWS S3. |
| 14 | + |
| 15 | +## Uploads |
| 16 | + |
| 17 | +File upload happens in three steps: |
| 18 | + |
| 19 | +1. Posting file metadata to the HAT and retrieving the URL to send the file to directly |
| 20 | +2. Directly uploading the file to the (securely signed) URL |
| 21 | +3. Marking the file complete at the HAT |
| 22 | + |
| 23 | +Uploading metadata is simple: call `POST /api/v2/files/upload` with the file details: |
| 24 | + |
| 25 | +```curl |
| 26 | + curl -X POST -H "Accept: application/json" -H "X-Auth-Token: ${HAT_AUTH_TOKEN}" \ |
| 27 | + -H "Content-Type: application/json" \ |
| 28 | + -d '{ |
| 29 | + "name": "testFile.png", |
| 30 | + "source": "test", |
| 31 | + "tags": ["tag1", "tag2"], |
| 32 | + "title": "test Title", |
| 33 | + "description": "a very interesting test file" |
| 34 | + }' \ |
| 35 | + "https://${HAT_ADDRESS}/api/v2/files/upload" |
| 36 | +``` |
| 37 | + |
| 38 | +Only the `name` and `source` properties are mandatory; all others are optional. You can also attach `dateCreated` and `lastUpdated` fields with Unix timestamps (the example response below uses millisecond precision) to set them explicitly. If everything is successful, the HAT will respond with a copy of the metadata as well as additional information: |
| 39 | + |
| 40 | +``` |
| 41 | +{ |
| 42 | + "fileId": "testtestfile-12.png", |
| 43 | + "name": "testFile.png", |
| 44 | + "source": "test", |
| 45 | + "tags": ["tag1", "tag2"], |
| 46 | + "title": "test Title", |
| 47 | + "description": "a very interesting test file", |
| 48 | + "dateCreated": 1487871142325, |
| 49 | + "lastUpdated": 1487871142329, |
| 50 | + "status": { |
| 51 | + "status": "New" |
| 52 | + }, |
| 53 | + "contentUrl": "https://hat-storage-test.s3.amazonaws.com/HAT_ADDRESS/testtestfile-12.png?AWSAccessKeyId=AKIAJSOXH3FJPB43SWGQ&Expires=1487871442&Signature=CTRdDW8nKBqNcuwK0ssH77zjkec%3D", |
| 54 | + "contentPublic": false, |
| 55 | + "permissions": [ |
| 56 | + { |
| 57 | + "userId": "694dd8ed-56ae-4910-abf1-6ec4887b4c42", |
| 58 | + "contentReadable": true |
| 59 | + } |
| 60 | + ] |
| 61 | +} |
| 62 | +``` |
| 63 | + |
| 64 | +Importantly, the response includes `fileId`, the unique identifier of the file on the HAT, and `contentUrl`, the location the file should be uploaded to. The upload `contentUrl` is signed and valid only for a limited time, most likely 5 minutes, after which it becomes invalid. The upload itself can then be done as follows (*note:* the `x-amz-server-side-encryption` header is mandatory): |
| 65 | + |
| 66 | +```curl |
| 67 | +curl -v -T "${LOCAL_FILE}" \ |
| 68 | + -H "x-amz-server-side-encryption: AES256" \ |
| 69 | + "https://hat-storage-test.s3.amazonaws.com/HAT_ADDRESS/testtestfile-12.png?AWSAccessKeyId=AKIAJSOXH3FJPB43SWGQ&Expires=1487871442&Signature=CTRdDW8nKBqNcuwK0ssH77zjkec%3D" |
| 70 | +``` |
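Because the signed `contentUrl` is short-lived, a client may want to check that it is still valid before starting a large upload. The following is a minimal sketch, assuming the URL carries an `Expires` query parameter holding a Unix timestamp in seconds, as the example `contentUrl` above does; other signing schemes may encode expiry differently. The function names are hypothetical and not part of the HAT API.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of the HAT API.

# Print the value of the Expires query parameter (Unix seconds) of a
# pre-signed URL; prints nothing if the parameter is absent.
content_url_expires() {
  printf '%s\n' "$1" | sed -n 's/.*[?&]Expires=\([0-9][0-9]*\).*/\1/p'
}

# Succeed (exit status 0) if the URL has not yet expired at the given
# time, which defaults to the current time.
content_url_is_fresh() {
  local url="$1" now="${2:-$(date +%s)}" expires
  expires="$(content_url_expires "$url")"
  [ -n "$expires" ] && [ "$now" -lt "$expires" ]
}
```

If the check fails, the simplest recovery is probably to fetch the file metadata again to obtain a fresh `contentUrl`, although whether the HAT re-issues an upload URL that way is not confirmed here.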
| 71 | + |
| 72 | +Finally, to mark the file "Completed", call `PUT /api/v2/files/file/:fileId/complete`. It will again respond with file metadata: |
| 73 | + |
| 74 | +``` |
| 75 | +{ |
| 76 | + "fileId": "testtestfile-12.png", |
| 77 | + "name": "testFile.png", |
| 78 | + "source": "test", |
| 79 | + "tags": ["tag1", "tag2"], |
| 80 | + "title": "test Title", |
| 81 | + "description": "a very interesting test file", |
| 82 | + "dateCreated": 1487871142325, |
| 83 | + "lastUpdated": 1487871142329, |
| 84 | + "status": { |
| 85 | + "size": 154639, |
| 86 | + "status": "Completed" |
| 87 | + }, |
| 88 | + "contentPublic": false, |
| 89 | + "permissions": [ |
| 90 | + { |
| 91 | + "userId": "694dd8ed-56ae-4910-abf1-6ec4887b4c42", |
| 92 | + "contentReadable": true |
| 93 | + } |
| 94 | + ] |
| 95 | +} |
| 96 | +``` |
| 97 | + |
| 98 | +The file `status` is now marked as `Completed` and also contains the file size in bytes. The request will fail if the file doesn't exist, hasn't been fully uploaded, or you do not have permission to mark the file completed (you will have it if you started the upload in the first place). |
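The three steps can be chained in a single script. This is a sketch rather than a reference client: it assumes `curl` and `jq` are installed (any JSON parser would do), that `HAT_ADDRESS` and `HAT_AUTH_TOKEN` are set in the environment as in the examples above, and the function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch of the three-step upload; function names are hypothetical.

# Build a files API URL from a HAT address and an endpoint path.
hat_api_url() {
  printf 'https://%s/api/v2/files/%s\n' "$1" "$2"
}

# Usage: hat_upload_file <local_file> <name> <source>
hat_upload_file() {
  local file="$1" name="$2" src="$3" meta file_id content_url

  # Step 1: post the mandatory metadata; the response carries
  # fileId and the signed contentUrl.
  meta="$(jq -n --arg name "$name" --arg source "$src" \
            '{name: $name, source: $source}' |
          curl -sf -X POST \
            -H "Accept: application/json" \
            -H "Content-Type: application/json" \
            -H "X-Auth-Token: ${HAT_AUTH_TOKEN}" \
            -d @- "$(hat_api_url "$HAT_ADDRESS" upload)")" || return 1
  file_id="$(printf '%s' "$meta" | jq -r '.fileId')"
  content_url="$(printf '%s' "$meta" | jq -r '.contentUrl')"

  # Step 2: send the file contents straight to the signed URL.
  curl -sf -T "$file" \
    -H "x-amz-server-side-encryption: AES256" \
    "$content_url" || return 1

  # Step 3: mark the file as complete at the HAT.
  curl -sf -X PUT \
    -H "Accept: application/json" \
    -H "X-Auth-Token: ${HAT_AUTH_TOKEN}" \
    "$(hat_api_url "$HAT_ADDRESS" "file/${file_id}/complete")"
}
```

A call such as `hat_upload_file ./testFile.png testFile.png test` would then print the final metadata on success and return a non-zero status if any step fails.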
| 99 | + |
| 100 | +Files can also be deleted (by the *owner* only!) by calling `DELETE /api/v2/files/file/:fileId`. |
| 101 | + |
| 102 | +## Viewing contents |
| 103 | + |
| 104 | +- `GET /api/v2/files/file/:fileId` to list the metadata of a file, including a `contentUrl` pointing to a pre-signed temporary URL for the file contents if the user is permitted file access |
| 105 | +- `GET /api/v2/files/content/:fileId` to get the contents of a file if the file is marked publicly accessible or the client is permitted file content access. The endpoint redirects to the pre-signed temporary content URL, or returns a 404 (Not Found) error code if the file does not exist or is not accessible |
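The two endpoints above could be wrapped roughly as follows. This is a sketch under the same assumptions as the earlier examples (`HAT_ADDRESS` and `HAT_AUTH_TOKEN` in the environment); the function names are hypothetical. The download resolves the redirect in a separate step instead of following it automatically, so that the auth token is only sent to the HAT and not to the storage host.

```shell
#!/usr/bin/env bash
# Sketch only; function names are hypothetical.

# Build the URL of a per-file endpoint: "file" for metadata,
# "content" for contents.
hat_file_url() {
  printf 'https://%s/api/v2/files/%s/%s\n' "$1" "$2" "$3"
}

# Print a file's metadata as JSON.
hat_file_metadata() {
  curl -sf -H "Accept: application/json" \
    -H "X-Auth-Token: ${HAT_AUTH_TOKEN}" \
    "$(hat_file_url "$HAT_ADDRESS" file "$1")"
}

# Download a file's contents to a local path.
hat_file_download() {
  local file_id="$1" out="$2" signed_url
  # Ask the HAT where the contents live without following the redirect.
  signed_url="$(curl -s -o /dev/null -w '%{redirect_url}' \
    -H "X-Auth-Token: ${HAT_AUTH_TOKEN}" \
    "$(hat_file_url "$HAT_ADDRESS" content "$file_id")")"
  # An empty value means there was no redirect (e.g. a 404 response).
  [ -n "$signed_url" ] || return 1
  # Fetch the contents from the pre-signed URL; no token is needed here.
  curl -sf -o "$out" "$signed_url"
}
```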
| 106 | + |
| 107 | +## Access Control |
| 108 | + |
| 109 | +The HAT owner can access all files; for everyone else there are three options for file access: |
| 110 | + |
| 111 | +- another HAT user can be marked to have access to the file's metadata |
| 112 | +- another HAT user can be marked to have access to both the file's metadata and its contents |
| 113 | +- a file can be marked to have its contents publicly accessible (e.g. publishing photos or your book!) |
| 114 | + |
| 115 | +By default, the user who saved the file onto the HAT is allowed to see the file's metadata and contents, but only the `owner` can adjust file permissions by: |
| 116 | + |
| 117 | +- calling `GET /api/v2/files/allowAccess/:fileId/:userId` to allow a specific user (`:userId`) to access a specific file (`:fileId`), optionally setting the `content` query parameter to `true`/`false` to control content access (`false` by default). Conversely, call `GET /api/v2/files/restrictAccess/:fileId/:userId` to restrict access. |
| 118 | +- calling `POST /api/v2/files/allowAccess/:userId` with a file template to grant access to a set of files (same syntax as for file search!). Conversely, call `POST /api/v2/files/restrictAccess/:userId` to restrict access. |
| 119 | +- calling `GET /api/v2/files/allowAccessPublic/:fileId` and `GET /api/v2/files/restrictAccessPublic/:fileId` to control public file access |
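As an illustration of the per-file call, granting another user access could look like the sketch below. It assumes the owner's token is in `HAT_AUTH_TOKEN` and the HAT address in `HAT_ADDRESS`; the function names are hypothetical, and the `content` value is always passed explicitly even though the endpoint defaults it to `false`.

```shell
#!/usr/bin/env bash
# Sketch only; function names are hypothetical.

# Build the allowAccess URL for one file and one user.
# Usage: hat_allow_access_url <hat_address> <file_id> <user_id> [true|false]
hat_allow_access_url() {
  local hat="$1" file_id="$2" user_id="$3" content="${4:-false}"
  printf 'https://%s/api/v2/files/allowAccess/%s/%s?content=%s\n' \
    "$hat" "$file_id" "$user_id" "$content"
}

# Grant access as the HAT owner.
# Usage: hat_allow_access <file_id> <user_id> [true|false]
hat_allow_access() {
  curl -sf -H "Accept: application/json" \
    -H "X-Auth-Token: ${HAT_AUTH_TOKEN}" \
    "$(hat_allow_access_url "$HAT_ADDRESS" "$@")"
}
```

For example, `hat_allow_access testtestfile-12.png 694dd8ed-56ae-4910-abf1-6ec4887b4c42 true` would request both metadata and content access for that user.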
| 120 | + |
| 121 | +## Search |
| 122 | + |
| 123 | +HAT files can be looked up by any part of metadata attached to them: |
| 124 | + |
| 125 | +- `fileId` for an exact match, where one or no files are returned |
| 126 | +- `name` for an exact match on the original name, but multiple files could potentially be returned. *Empty* string if you do not want to match against `name` |
| 127 | +- `source` matching all files from a specific source such as `facebook`. *Empty* string if you do not want to match against `source` |
| 128 | +- `tags` for a set of tags, all of which matching files must have attached |
| 129 | +- `title` and `description` for an approximate, text-based search matching the fields |
| 130 | +- `status` to filter e.g. only files that are marked `Completed` |
| 131 | + |
| 132 | +To search for files, call `POST /api/v2/files/search` with a file template to match against. All calls must be authenticated with the user's token, and only files the user is allowed to access are returned (all files for the HAT owner!): |
| 133 | + |
| 134 | +```curl |
| 135 | + curl -X POST -H "Accept: application/json" -H "X-Auth-Token: ${HAT_AUTH_TOKEN}" \ |
| 136 | + -H "Content-Type: application/json" \ |
| 137 | + -d '{ |
| 138 | + "name": "testFile.png", |
| 139 | + "source": "test", |
| 140 | + "tags": ["tag1", "tag2"], |
| 141 | + "title": "test Title", |
| 142 | + "description": "a very interesting test file", |
| 143 | + "status": { "status": "Completed", "size": 0} |
| 144 | + }' \ |
| 145 | + "https://${HAT_ADDRESS}/api/v2/files/search" |
| 146 | +``` |