---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```
<!-- badges: start -->
[![R build status](https://github.com/yanlesin/SEC13Flist/workflows/R-CMD-check/badge.svg)](https://github.com/yanlesin/SEC13Flist/actions)
[![codecov](https://codecov.io/github/yanlesin/SEC13Flist/branch/master/graphs/badge.svg)](https://codecov.io/gh/yanlesin/SEC13Flist/branch/master)
[![SEC13Flist status badge](https://yanlesin.r-universe.dev/badges/SEC13Flist)](https://yanlesin.r-universe.dev/SEC13Flist)
<!-- badges: end -->
# SEC13Flist
The goal of the SEC13Flist package is to provide functions for working with the official list of Section 13(f) securities.

The functions `SEC_13F_list` and `SEC_13F_list_local` parse the PDF list from [SEC.gov](https://www.sec.gov/divisions/investment/13flists.htm) for the supplied year and quarter and return a data frame of securities that keeps the same structure as the official list. Both functions append `YEAR` and `QUARTER` columns to the list. The returned data frame can be customized and filtered according to your needs.

The `SEC_13F_list` function retrieves the list from the [SEC.gov](https://www.sec.gov/divisions/investment/13flists.htm) website and may need adjusting if the landing page changes. If a change to the landing page breaks it, you can use the `SEC_13F_list_local` function to parse a file downloaded to a local folder.

The `SEC_13F_list` function requires a user agent to be set before it attempts a download from the sec.gov website. For details on how to set the user agent and on the maximum request rate, please refer to [https://www.sec.gov/os/accessing-edgar-data](https://www.sec.gov/os/accessing-edgar-data).

The user agent can be set via `options(HTTPUserAgent=...)`.
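
For example, the option can be set once at the start of a session. The value below is only a placeholder, assuming the "company name plus contact e-mail" format suggested on the SEC page linked above; substitute your own details.

```r
# Placeholder value: replace with your own organisation name and contact e-mail
options(HTTPUserAgent = "Sample Company Name admin@example.com")

# Confirm the value that R will send with download requests
getOption("HTTPUserAgent")
```
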

The functions `isCusip`, `isSedol`, and `isIsin` verify the check digit of a security identifier by recomputing it from the leading characters of the identifier (everything except the final check digit). Each function returns `TRUE` for a correct identifier and `FALSE` for an incorrect one.

Pseudocode for the CUSIP, SEDOL, and ISIN checksum calculations is available at [Wikipedia - CUSIP](https://en.wikipedia.org/wiki/CUSIP), [Wikipedia - SEDOL](https://en.wikipedia.org/wiki/SEDOL), and [Wikipedia - ISIN](https://en.wikipedia.org/wiki/International_Securities_Identification_Number). R/C/C++ implementations are at [Rosettacode - CUSIP](https://rosettacode.org/wiki/CUSIP#C.2B.2B), [Rosettacode - SEDOL](https://rosettacode.org/wiki/SEDOLs#R), and [Rosettacode - ISIN](https://rosettacode.org/wiki/Validate_International_Securities_Identification_Number#C).
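
To illustrate the kind of calculation involved, here is a minimal base R sketch of the CUSIP check digit, following the "modulus 10 double add double" scheme described in the Wikipedia article. `cusip_check_digit` is a hypothetical helper written for illustration; it is not part of the package and is not necessarily how `isCusip` is implemented.

```r
# Hypothetical helper: compute the check digit from the first eight CUSIP characters
cusip_check_digit <- function(cusip8) {
  stopifnot(is.character(cusip8), length(cusip8) == 1, nchar(cusip8) == 8)

  # Character values: digits 0-9, letters A-Z as 10-35, then * = 36, @ = 37, # = 38
  alphabet <- c(0:9, LETTERS, "*", "@", "#")
  chars <- strsplit(toupper(cusip8), "")[[1]]
  values <- match(chars, alphabet) - 1L

  # Any character outside the CUSIP alphabet makes the result undefined
  if (anyNA(values)) {
    return(NA_integer_)
  }

  # c(1L, 2L) is recycled across the eight positions, doubling every second character
  weighted <- values * c(1L, 2L)

  # Add the tens and units digits of each weighted value
  total <- sum(weighted %/% 10L + weighted %% 10L)

  # The check digit is the tens complement of the total, modulo 10
  (10L - total %% 10L) %% 10L
}
```

Under these assumptions, `cusip_check_digit("03783310")` returns `0`, which matches the ninth character of the CUSIP `037833100`.
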
## Installation
You can install the current development version from [GitHub](https://github.com/yanlesin/SEC13Flist) with:
```{R installation, eval=FALSE, include=TRUE}
remotes::install_github("yanlesin/SEC13Flist")
```
## Description of returned data for `SEC_13F_list`
`CUSIP`: chr - CUSIP number of the security

`HAS_LISTED_OPTION`: chr - An asterisk indicates that the security has a listed option; each option is individually listed with its own CUSIP number immediately below the name of the security having the option

`ISSUER_NAME`: chr - Issuer Name

`ISSUER_DESCRIPTION`: chr - Issuer Description

`STATUS`: chr - "ADDED" (the security has become a Section 13(f) security) or "DELETED" (the security has ceased to be a 13(f) security since the date of the last list)

`YEAR`: int - Year of the list

`QUARTER`: int - Quarter of the list
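
The documented structure can also be checked programmatically. The following is a minimal base R sketch; `expected_types` and `has_documented_structure` are hypothetical names introduced here for illustration and are not part of the package.

```r
# Documented columns and their types, taken from the descriptions above
expected_types <- c(
  CUSIP = "character",
  HAS_LISTED_OPTION = "character",
  ISSUER_NAME = "character",
  ISSUER_DESCRIPTION = "character",
  STATUS = "character",
  YEAR = "integer",
  QUARTER = "integer"
)

# Hypothetical helper: TRUE if `df` has every documented column with the documented type
has_documented_structure <- function(df) {
  if (!all(names(expected_types) %in% names(df))) {
    return(FALSE)
  }
  actual <- vapply(
    df[names(expected_types)],
    function(col) class(col)[1],
    character(1)
  )
  all(actual == expected_types)
}
```

A call such as `has_documented_structure(SEC_13F_list(2018, 3))` would then be expected to return `TRUE`, assuming the download succeeds.
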
## Examples
These are basic examples of usage:
```{r example, eval=FALSE, include=TRUE}
library(SEC13Flist)
library(tidyverse)

## Return list for Q3 2018
SEC13Flist_2018_Q3 <- SEC_13F_list(2018, 3)

## Customizing
SEC13Flist_current <- SEC_13F_list(2023, 3) |>
  filter(STATUS != "DELETED") |> # Filter out records with STATUS "DELETED"
  select(-YEAR, -QUARTER) # Remove YEAR and QUARTER columns

## Verifying CUSIP
## CUSIPs are not unique; isCusip is not vectorized and requires a single
## nine-character CUSIP as input, hence rowwise()
verify_CUSIP <- SEC_13F_list(2023, 3) |>
  rowwise() |>
  mutate(VALID_CUSIP = isCusip(CUSIP)) # Validate each CUSIP
```
## Use of CUSIP Codes
According to the FAQ section of [CUSIP Global Services](https://www.cusip.com/cusip/cgs-license-fees.htm):

>Can firms take CGS Data from public sources and create their own database without signing a license agreement with CGS?
>
>CGS Data is publicly available in some offering documents and from other sources. Firms can elect to collect this information and store it in their internal databases for non-commercial use, provided that the source of such information permitted the reproduction and use of such information. However, CGS's experience has been that the CGS data generally has not come from publicly available sources but rather from other sources such as a CGS Authorized Distributor or through improperly scraping websites of CGS customers with valid CGS’ licenses. Most end-user customers of CGS Data prefer to enter into a license agreement with CGS for authorized use and to enjoy the benefits of the integrity and functionality of downloadable, timely and accurate data (either from CGS directly or from an Authorized Distributor).
## Known issues with CUSIP codes supplied in SEC's Official List of 13(f) securities
[This discussion on Stack Exchange](https://quant.stackexchange.com/questions/16392/sec-13f-security-list-has-incorrect-cusip-numbers) describes a problem with CUSIP codes for CALL and PUT options that is still present in the current list.

[This FundApps support article](https://fundapps.zendesk.com/hc/en-us/articles/204837769-13F-list-Option-CUSIP-matching) describes how FundApps (a software provider for regulatory compliance) addresses the CUSIP quality issue: it includes all option securities whose CUSIP shares its first six characters with the main issue (marked with `*` in the `HAS_LISTED_OPTION` field of the list).
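
The six-character matching described above can be sketched with `substr()`. This only illustrates the idea and is not FundApps' or this package's implementation; `issuer_prefix` and `same_issuer` are hypothetical helper names, and the option-style CUSIPs in the usage note are made-up placeholders.

```r
# Hypothetical helper: first six characters of a CUSIP (the issuer part)
issuer_prefix <- function(cusip) {
  substr(cusip, 1, 6)
}

# Hypothetical helper: candidates that share the issuer prefix of `main_cusip`
same_issuer <- function(main_cusip, candidates) {
  candidates[issuer_prefix(candidates) == issuer_prefix(main_cusip)]
}
```

For instance, `same_issuer("037833100", c("037833900", "037833950", "594918104"))` keeps the first two candidates, because they share the prefix `037833`, and drops the third.
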