Table of content:
- Introduction to Strings
- How to convert String into Array
- How to conver Array into String
- How to repeat String n times
- How to get part of a String
- How to reverse a String
- About Strings at a low level
- Unicode/UTF-X terminology brief overview
- How to get code of Сhar
- How to get Сhar from code
Introduction to Strings
At first view, strings a very similar to arrays (both of them has length property, accsess to elements by index, and many of the same methods):
const str = 'foo' console.log(str.length) // 3 console.log(str) // 'f' console.log(str.concat('bar')) // 'foobar'
const str = 'foo' // will create a new instance of a string console.log(str.repeat(2)) // 'foofoo' console.log(str) // 'foo'
How to convert String into Array
Use split method to convert string into array by some separator:
const str = 'foo bar baz' str.split(' ') // ['foo', 'bar', 'baz']
To split by each symbol you can pass a empty string as method argument:
const str = 'foo' str.split('') // ['f', 'o', 'o']
Also, you can call method as tag function:
const str = 'foo' str.split`` // ['f', 'o', 'o']
How to conver Array into String
The oposite method is join:
const arr = ['foo', 'bar', 'baz'] arr.join(' ') // 'foo bar baz'
How to repeat String n times
In the past, there’re a few ways by generation empty arrays and joining them with a target string, but ES6 offers repeat method as a more elegant way for repeating a string:
'hi!'.repeat(3) // 'hi!hi!hi!'
How to get part of a String
For getting some part of string you can use a slice method. It has a interesting modes (direction from string start or end) depanding on arguments:
'abcde'.slice(1) // 'bcde' 'abcde'.slice(0, 2) // 'ab' 'abcde'.slice(0, -1) // 'abcd' 'abcde'.slice(-3, -1) // 'cd' 'abcde'.slice(-3) // 'cde'
How to reverse a String
It’s a so common interview question. The most easier way it’s using a array reverse method (strings don’t have this method):
'abcde'.split('').reverse().join('') // edcba
About Strings at a low level
As we mentioned before, string is a sequence of elements (actually we called them ‘characters’), but which physical properties does this sequence of elements have?
Elements are just 16-bit unsigned integer values (each element is a
UTF-16 code point, see below info about Unicode). The maximum length of the string is
Unicode/UTF-X terminology brief overview:
Unicode- Universal Coded Character Set
UTF-16- Unicode Standard, that uses 16-bits for each code unit
Code unit- part of code point
Code point- an atomic unit of information (in simple words - symbol, that consists of one or more code units). In
UTF-16each code point represented by 1 or 2 code unit.
How to get code of Сhar
Each element is 16-bit integer, but how to get that value? Old good method is charCodeAt:
'A'.charCodeAt(0) // 65 'B'.charCodeAt(0) // 66 'a'.charCodeAt(0) // 97
But it works only to
0x10000 (65,536) while avaliable values are ranged from
0x10FFFF (1,114,111). And to handle values above that limit (like emoji) you have to play around pair of values (code units, aka a surrogate pairs). And actually all built-in string methods/properties work with code units:
'😜'.length // 2 '😜'.split('') // ['�', '�'] '😜'.split('').map((u) => u.charCodeAt(0)) // [55357, 56860]
To handle these methods in the context of pairs of code units you have to write custom functions or use some libraries (we’re skipping that part). But, there’re good news. ES6 offers some of these methods. One of them
codePointAt that returns an integer, that epresents
'😜'.charCodeAt(0) // 55357, wrong (only a first unit) '😜'.codePointAt(0) // 128540, correct
How to get Сhar from code
The opposite action is getting a char element from code. And here are two ways, too.
String.fromCharCode works with limited range of values (subset of most popular unicode symbols), and
String.fromCodePoint that works opposite to
String.fromCharCode(65) // 'A' String.fromCharCode(65, 97) // 'Aa' String.fromCharCode(55357, 56860) // '😜' String.fromCodePoint(128540) // '😜'