R4Econ/amto/array/fs_ary_string.Rmd at master · FanWangEcon/R4Econ · GitHub

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
---
title: "R String Arrays"
titleshort: "String Operations"
description: |
  Split, concatenate, subset, replace, and substring strings.
  Convert number to string without decimal and negative sign.
  Concatenate numeric and string arrays as a single string.
  Regular expression
core:
  - package: r
    code: |
      paste0()
      paste0(round(runif(3),3), collapse=',')
      sub()
      gsub()
      grepl()
      sprintf()
date: 2022-07-13
date_start: 2020-04-11
output:
  pdf_document:
    pandoc_args: '../../_output_kniti_pdf.yaml'
    includes:
      in_header: '../../preamble.tex'
  html_document:
    pandoc_args: '../../_output_kniti_html.yaml'
    includes:
      in_header: "../../hdga.html"
always_allow_html: true
urlcolor: blue
---

### String Arrays

```{r global_options, include = FALSE}
try(source("../../.Rprofile"))
```

`r text_shared_preamble_one`
`r text_shared_preamble_two`
`r text_shared_preamble_thr`

#### Positive or Negative Floating Number to String

There is a number, that contains decimal and possibly negative sign and has some decimals, convert this to a string that is more easily used as a file or folder name.

```{r}
ls_fl_rho <- c(1, -1, -1.5 -100, 0.5, 0.11111111, -199.22123)
for (fl_rho in ls_fl_rho) {
  st_rho <- paste0(round(fl_rho, 4))
  st_rho <- gsub(x = st_rho,  pattern = "-", replacement = "n")
  st_rho <- gsub(x = st_rho,  pattern = "\\.", replacement = "p")
  print(paste0('st_rho=', st_rho))
}
```

#### String Replace

- r string wildcard replace between regex
- [R - replace part of a string using wildcards](https://stackoverflow.com/a/27246520/8280804)

```{r, amto.array.fs_ary_string.replace, eval=FALSE}
# String replacement
gsub(x = paste0(unique(df.slds.stats.perc$it.inner.counter), ':',
                unique(df.slds.stats.perc$z_n_a_n), collapse = ';'),
     pattern = "\n",
     replacement = "")
gsub(x = var,  pattern = "\n", replacement = "")
gsub(x = var.input,  pattern = "\\.", replacement = "_")
```

String replaces a segment, search by wildcard. Given the string below, delete all text between carriage return and pound sign:

```{r}
st_tex_text <- "\n% Lat2ex Comments\n\\newcommand{\\exa}{\\text{from external file: } \\alpha + \\beta}\n% More LaLatex Comments\n"
st_clean_a1 <- gsub("\\%.*?\\\n", "", st_tex_text)
st_clean_a2 <- gsub("L.*?x", "[LATEX]", st_tex_text)
print(paste0('st_tex_text:', st_tex_text))
print(paste0('st_clean_a1:', st_clean_a1))
print(paste0('st_clean_a2:', st_clean_a2))
```

String delete after a particular string:

```{r}
st_tex_text <- "\\end{equation}\n}\n% Even more comments from Latex preamble"
st_clean_a1 <- gsub("\\\n%.*","", st_tex_text)
print(paste0('st_tex_text:', st_tex_text))
print(paste0('st_clean_a1:', st_clean_a1))
```

#### Search If and Which String Contains

- r if string contains
- r if string contains either or grepl
- [Use grepl to search either of multiple substrings in a text](https://stackoverflow.com/a/26319765/8280804)

Search for a single substring in a single string:

```{r support dtype string if contains}
st_example_a <- 'C:/Users/fan/R4Econ/amto/tibble/fs_tib_basics.Rmd'
st_example_b <- 'C:/Users/fan/R4Econ/amto/tibble/_main.html'
grepl('_main', st_example_a)
grepl('_main', st_example_b)
```

Search for if one of a set of substring exists in a set of strings. In particular which one of the elements of *ls_spn* contains at least one of the elements of *ls_str_if_contains*. In the example below, only the first path does not contain either the word *aggregate* or *index* in the path. This can be used after all paths have been found recursively in some folder to select only desired paths from the full set of possibilities:

```{r}
ls_spn <- c("C:/Users/fan/R4Econ//panel/basic/fs_genpanel.Rmd",
            "C:/Users/fan/R4Econ//summarize/aggregate/main.Rmd",
            "C:/Users/fan/R4Econ//summarize/index/fs_index_populate.Rmd")
ls_str_if_contains <- c("aggregate", "index")
str_if_contains <- paste(ls_str_if_contains, collapse = "|")
grepl(str_if_contains, ls_spn)
```

#### String Split

Given some string, generated for example by cut, get the lower cut starting points, and also the higher end point

```{r}
# Extract 0.216 and 0.500 as lower and upper bounds
st_cut_cate <- '(0.216,0.500]'
# Extract Lower Part
substring(strsplit(st_cut_cate, ",")[[1]][1], 2)
# Extract second part except final bracket Option 1
intToUtf8(rev(utf8ToInt(substring(intToUtf8(rev(utf8ToInt(strsplit(st_cut_cate, ",")[[1]][2]))), 2))))
# Extract second part except final bracket Option 2
gsub(strsplit(st_cut_cate, ",")[[1]][2],  pattern = "]", replacement = "")
```

Make a part of a string bold. Go from "ABC EFG, OPQ, RST" to "**ABC EFG**, OPQ, RST". This could be for making the name of an author bold, and preserve affiliation information.

```{r}
st_paper_author_ori <- "ABC EFG, OPQ, RST"
ar_st_ori <- strsplit(st_paper_author_ori, ", ")[[1]]
st_after_1stcomma <- paste0(ar_st_ori[2:length(ar_st_ori)], collapse= ", ")
st_paper_author <- paste0('<b>', ar_st_ori[1], "</b>, ", st_after_1stcomma )
print(st_paper_author)
```

#### String Concatenate

Concatenate string array into a single string.

```{r amto.array.fs_ary_string.concatenate}
# Simple Collapse
vars.group.by <- c('abc', 'efg')
paste0(vars.group.by, collapse='|')
```

Concatenate a numeric array into a single string.

```{r amto.array.fs_ary_string.concatenate}
# Simple Collapse
set.seed(123)
ar_fl_numbers <- runif(5)
paste0('ar_fl_numbers = ',
       paste(round(ar_fl_numbers,3), collapse=', ')
)
```

#### String Add Leading Zero

```{r amto.array.fs_ary_string.leadingzero}
# Add Leading zero for integer values to allow for sorting when
# integers are combined into strings
it_z_n <- 1
it_a_n <- 192
print(sprintf("%02d", it_z_n))
print(sprintf("%04d", it_a_n))
```

#### Substring Components

Given a string, with certain structure, get components.

- r time string get month and year and day

```{r}
snm_full <- "20100701"
snm_year <-substr(snm_full,0,4)
snm_month <-substr(snm_full,5,6)
snm_day <-substr(snm_full,7,8)
print(paste0('full:', snm_full,
             ', year:', snm_year,
             ', month:', snm_month,
             ', day:', snm_day))
```