Data Structure
Files and Permissions
Sharepoint organizes files into sites, which contain folders and files. Files and folders are accessible to users and groups through permissions (sometimes called Access Control Lists or ACLs).
Currently, Merge supports the following Sharepoint object types:
Folders
Documents
Spreadsheets
Presentations
Notebooks
Each notebook is mapped as a folder, and its sections are mapped as pages.
Merge does not support the following object types:
Pages
Lists
Checksums
The file checksum object is used to store a Sharepoint generated checksum value, providing a way to confirm if file contents have changed. This field is mapped from the Sharepoint API, not generated by Merge.
In certain cases, Sharepoint may not generate a checksum value, or may assign a generic value. An example of a generic checksum value is:
"checksum": {
"type": "quickXor",
"content_hash": "AAAAAAAAAAAAAAAAAAAAAAAAAAA="
},
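An application that compares checksums across syncs may want to treat a missing or generic value as "unknown" rather than "unchanged". The following is a minimal sketch, assuming the checksum arrives as a dictionary shaped like the example above. The helper names are hypothetical and are not part of the Merge API.

```python
from typing import Optional

# Generic placeholder value, copied from the example above.
GENERIC_CONTENT_HASH = "AAAAAAAAAAAAAAAAAAAAAAAAAAA="


def is_usable_checksum(checksum: Optional[dict]) -> bool:
    """Return True only when the checksum carries a real, non-generic hash."""
    if not checksum:
        return False
    content_hash = checksum.get("content_hash")
    return bool(content_hash) and content_hash != GENERIC_CONTENT_HASH


def contents_changed(old: Optional[dict], new: Optional[dict]) -> Optional[bool]:
    """Compare two checksum objects; None means the comparison is inconclusive."""
    if not (is_usable_checksum(old) and is_usable_checksum(new)):
        return None
    old_key = (old.get("type"), old.get("content_hash"))
    new_key = (new.get("type"), new.get("content_hash"))
    return old_key != new_key
```

When `contents_changed` returns `None`, a caller would need another signal, such as a modification timestamp, to decide whether to re-fetch the file.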
Users
Users are individuals with access to the Sharepoint instance. They can be associated through Groups, and granted access to files/folders/drives at the group level.
Users can also be associated with a domain (e.g. "@merge.dev"). Sharepoint does not support whitelisting of external domains, meaning there is typically only one domain-level permission for a Sharepoint instance. This will show up as a permission of type "COMPANY" in the Merge API.
Groups
Sharepoint has several types of groups. Merge supports Azure AD groups and Microsoft 365 Groups, which are the modern standard for managing SharePoint access.
Sharepoint also supports Sharepoint Site groups. These are typically identified by the group name pattern "{Site name} Owners", "{Site name} Members", and "{Site name} Visitors". Merge will create these as group common models, but they will not contain any users and will not have the remote_id field populated. Microsoft is phasing out site groups as they are part of a legacy permission model rarely used in current deployments.
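Because site groups arrive without users or a remote_id, an application may want to separate them from directory-backed groups before resolving permissions. The following is a rough sketch, assuming each group is a dictionary with `name` and `remote_id` keys. Only `remote_id` is named above; the `name` key and both function names are assumptions made for illustration.

```python
from typing import Iterable, List, Tuple

# Name suffixes taken from the site group naming pattern described above.
SITE_GROUP_SUFFIXES = (" Owners", " Members", " Visitors")


def looks_like_site_group(group: dict) -> bool:
    """Heuristic: no remote_id and a name ending in a site group suffix."""
    name = group.get("name") or ""
    return not group.get("remote_id") and name.endswith(SITE_GROUP_SUFFIXES)


def split_groups(groups: Iterable[dict]) -> Tuple[List[dict], List[dict]]:
    """Return (site_groups, directory_groups), preserving input order."""
    site_groups: List[dict] = []
    directory_groups: List[dict] = []
    for group in groups:
        if looks_like_site_group(group):
            site_groups.append(group)
        else:
            directory_groups.append(group)
    return site_groups, directory_groups
```

The name check is only a heuristic: a directory group could share one of these suffixes, which is why the sketch also requires a missing remote_id.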
Ingestion
Sync cadence
Merge polls the Sharepoint API at regular intervals, defined in this table, for updates to all common models. The sync interval is limited by rate limits Sharepoint enforces on their APIs. Merge will always use timestamp filtering to ensure polling is as efficient as possible.
Webhooks
Sharepoint supports webhooks that notify Merge about key events, and allow for near real-time updates. Webhooks are subscribed at the site level, and will trigger on the following events:
File update
File create
File delete
Folder update
Folder create
Folder delete
A notable exception is around permissions updates. Sharepoint will not emit a webhook when permissions change on a file or folder.
For most use cases, webhooks will not be broad enough or reliable enough to guarantee data accuracy. Polling should always set the minimum data freshness benchmark for your application.
Mime Types
Sharepoint allows two possible export types on their Graph API. Merge allows passing a mime_type parameter for our /direct-download endpoint only. The proxy download endpoint will return the default Sharepoint export type, and does not currently accept the mime_type parameter.
Download mime type | Description | Supported native file type |
pdf | Converts the item into PDF format. | csv, doc, docx, odp, ods, odt, pot, potm, potx, pps, ppsx, ppsxm, ppt, pptm, pptx, rtf, xls, xlsx |
html | Converts the item into HTML format. | loop, fluid, wbtx |
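To illustrate how the table above might be applied, the sketch below picks an export type from a file's extension before calling /direct-download. The function name is hypothetical, and it assumes the mime_type parameter accepts the short values "pdf" and "html"; confirm the exact accepted values against the Merge API reference.

```python
from pathlib import PurePosixPath
from typing import Optional

# Native file types from the table above, keyed by export type.
PDF_EXPORTABLE = {
    "csv", "doc", "docx", "odp", "ods", "odt", "pot", "potm", "potx",
    "pps", "ppsx", "ppsxm", "ppt", "pptm", "pptx", "rtf", "xls", "xlsx",
}
HTML_EXPORTABLE = {"loop", "fluid", "wbtx"}


def pick_export_mime_type(filename: str) -> Optional[str]:
    """Return the export type for a file name, or None if no conversion applies."""
    extension = PurePosixPath(filename).suffix.lstrip(".").lower()
    if extension in PDF_EXPORTABLE:
        return "pdf"  # assumed parameter value
    if extension in HTML_EXPORTABLE:
        return "html"  # assumed parameter value
    return None
```

A `None` result means the file type is not listed for conversion, so a caller would omit the mime_type parameter and accept the default export.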
Authentication
Supported authentication types
Merge supports 3 authentication types for Sharepoint. For a description of the scopes requested for each type, refer to this guide.
Admin read & write
Merge will launch a Sharepoint OAuth flow, which will prompt your customer for their username and password.
Merge will have access to all shared drives and folders.
This authentication method is only recommended if your application needs to POST files back to Sharepoint.
Admin read only
Merge will launch a Sharepoint OAuth flow, which will prompt your customer for their username and password.
This is the most common authentication method used, as it grants Merge a single set of credentials to access all shared drives and folders.
Non-admin read only
Merge will launch a Sharepoint OAuth flow, which will prompt your customer for their username and password.
The authenticating user can be anyone with valid Sharepoint login credentials.
Merge will only have access to the authenticated user's personal drive, as well as shared drives and files explicitly shared with them.
Authentication errors
When authenticating, Merge will validate that the provided credentials have the requested access level. In the section below, we go over the possible failure modes.
404 on /v1.0/users/{user_id}/drives: a 404 error on a request to a user's drive indicates that the drive does not exist. This typically happens when the user has not logged into their account yet, and therefore has not instantiated their personal drive.
403 on /v1.0/users/{user_id}/drives: a 403 error on a request to a user's drive indicates that it is not accessible to the admin who authenticated the connection. This could be intentional, depending on the specifics of the Sharepoint user's instance.
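The two failure modes above can be turned into readable messages for logs or support tooling. This is a minimal sketch; the function and constant names are hypothetical, and the hint text simply restates the explanations given above.

```python
from typing import Optional

# Explanations for the two failure modes described above.
DRIVE_ERROR_HINTS = {
    404: "the drive does not exist, typically because the user has not "
         "logged in yet and has not instantiated their personal drive",
    403: "the drive is not accessible to the admin who authenticated the "
         "connection, which may be intentional for this instance",
}


def explain_drive_error(status_code: int, user_id: str) -> Optional[str]:
    """Return a readable explanation for a failed drive request, if known."""
    hint = DRIVE_ERROR_HINTS.get(status_code)
    if hint is None:
        return None  # not one of the failure modes covered here
    return f"{status_code} on /v1.0/users/{user_id}/drives: {hint}"
```

Any other status code returns `None`, leaving the caller to handle it with its general error path.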