Trait bstr::ByteVec
A trait that extends Vec<u8> with string oriented methods.
Note that when using the constructor methods, such as ByteVec::from_slice, one should actually call them using the concrete type. For example:
    use bstr::{B, ByteVec};
    let s = Vec::from_slice(b"abc"); // NOT ByteVec::from_slice("...")
    assert_eq!(s, B("abc"));
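One way to see why the concrete type matters is a minimal sketch with a stand-in trait. `ByteVecLike` and `build_abc` are hypothetical names used only for illustration and are not part of bstr; the sketch assumes only that the constructor's signature does not mention `Self`, as in the `from_slice` signature documented on this page.

```rust
// Hypothetical stand-in for a trait shaped like ByteVec; this is not bstr's code.
trait ByteVecLike {
    // `Self` does not appear anywhere in this signature, so the arguments
    // alone cannot tell the compiler which implementation is meant.
    fn from_slice<B: AsRef<[u8]>>(bytes: B) -> Vec<u8> {
        bytes.as_ref().to_vec()
    }
}

impl ByteVecLike for Vec<u8> {}

fn build_abc() -> Vec<u8> {
    // Naming the concrete type selects the `Vec<u8>` implementation.
    // Writing `ByteVecLike::from_slice(b"abc")` instead would be rejected,
    // because no implementing type could be inferred.
    Vec::<u8>::from_slice(b"abc")
}
```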
Provided methods
fn from_slice<B: AsRef<[u8]>>(bytes: B) -> Vec<u8>
Create a new owned byte string from the given byte slice.
Examples
Basic usage:
    use bstr::{B, ByteVec};
    let s = Vec::from_slice(b"abc");
    assert_eq!(s, B("abc"));
fn from_os_string(os_str: OsString) -> Result<Vec<u8>, OsString>
Create a new byte string from an owned OS string.
On Unix, this always succeeds and is zero cost. On non-Unix systems, this returns the original OS string if it is not valid UTF-8.
Examples
Basic usage:
    use std::ffi::OsString;
    use bstr::{B, ByteVec};
    let os_str = OsString::from("foo");
    let bs = Vec::from_os_string(os_str).expect("valid UTF-8");
    assert_eq!(bs, B("foo"));
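The platform split described above can be sketched with the standard library alone. `os_string_to_bytes` is a hypothetical helper, not bstr's implementation; it only illustrates why the Unix path cannot fail while the non-Unix path hands the original value back on invalid UTF-8.

```rust
use std::ffi::OsString;

// Hypothetical helper: on Unix an OsString is already a byte vector,
// so the conversion just moves the buffer out.
#[cfg(unix)]
fn os_string_to_bytes(os: OsString) -> Result<Vec<u8>, OsString> {
    use std::os::unix::ffi::OsStringExt;
    Ok(os.into_vec())
}

// Elsewhere the conversion goes through String, which requires valid UTF-8.
// On failure, `into_string` returns the untouched OsString as the error.
#[cfg(not(unix))]
fn os_string_to_bytes(os: OsString) -> Result<Vec<u8>, OsString> {
    os.into_string().map(String::into_bytes)
}
```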
fn from_os_str_lossy<'a>(os_str: &'a OsStr) -> Cow<'a, [u8]>
Lossily create a new byte string from an OS string slice.
On Unix, this always succeeds, is zero cost and always returns a slice. On non-Unix systems, this does a UTF-8 check. If the given OS string slice is not valid UTF-8, then it is lossily decoded into valid UTF-8 (with invalid bytes replaced by the Unicode replacement codepoint).
Examples
Basic usage:
    use std::ffi::OsStr;
    use bstr::{B, ByteVec};
    let os_str = OsStr::new("foo");
    let bs = Vec::from_os_str_lossy(os_str);
    assert_eq!(bs, B("foo"));
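The borrowed-versus-owned behaviour of the returned Cow can be illustrated with a standard-library sketch. `os_str_to_bytes_lossy` is a hypothetical helper and only approximates what the text describes; it is not taken from bstr.

```rust
use std::borrow::Cow;
use std::ffi::OsStr;

// Hypothetical helper: on Unix the bytes are borrowed directly, so the
// result is always `Cow::Borrowed`.
#[cfg(unix)]
fn os_str_to_bytes_lossy(os: &OsStr) -> Cow<'_, [u8]> {
    use std::os::unix::ffi::OsStrExt;
    Cow::Borrowed(os.as_bytes())
}

// Elsewhere a UTF-8 check runs first. Valid input stays borrowed; invalid
// input is re-encoded with U+FFFD, which needs an owned buffer.
#[cfg(not(unix))]
fn os_str_to_bytes_lossy(os: &OsStr) -> Cow<'_, [u8]> {
    match os.to_string_lossy() {
        Cow::Borrowed(s) => Cow::Borrowed(s.as_bytes()),
        Cow::Owned(s) => Cow::Owned(s.into_bytes()),
    }
}
```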
fn from_path_buf(path: PathBuf) -> Result<Vec<u8>, PathBuf>
Create a new byte string from an owned file path.
On Unix, this always succeeds and is zero cost. On non-Unix systems, this returns the original path if it is not valid UTF-8.
Examples
Basic usage:
    use std::path::PathBuf;
    use bstr::{B, ByteVec};
    let path = PathBuf::from("foo");
    let bs = Vec::from_path_buf(path).expect("must be valid UTF-8");
    assert_eq!(bs, B("foo"));
fn from_path_lossy<'a>(path: &'a Path) -> Cow<'a, [u8]>
Lossily create a new byte string from a file path.
On Unix, this always succeeds, is zero cost and always returns a slice. On non-Unix systems, this does a UTF-8 check. If the given path is not valid UTF-8, then it is lossily decoded into valid UTF-8 (with invalid bytes replaced by the Unicode replacement codepoint).
Examples
Basic usage:
    use std::path::Path;
    use bstr::{B, ByteVec};
    let path = Path::new("foo");
    let bs = Vec::from_path_lossy(path);
    assert_eq!(bs, B("foo"));
fn push_byte(&mut self, byte: u8)
Appends the given byte to the end of this byte string.
Note that this is equivalent to the generic Vec::push method. This method is provided to permit callers to explicitly differentiate between pushing bytes, codepoints and strings.
Examples
Basic usage:
    use bstr::ByteVec;
    let mut s = <Vec<u8>>::from("abc");
    s.push_byte(b'\xE2');
    s.push_byte(b'\x98');
    s.push_byte(b'\x83');
    assert_eq!(s, "abc☃".as_bytes());
fn push_char(&mut self, ch: char)
Appends the given char to the end of this byte string.
Examples
Basic usage:
    use bstr::ByteVec;
    let mut s = <Vec<u8>>::from("abc");
    s.push_char('1');
    s.push_char('2');
    s.push_char('3');
    assert_eq!(s, "abc123".as_bytes());
fn push_str<B: AsRef<[u8]>>(&mut self, bytes: B)
Appends the given slice to the end of this byte string. This accepts any type that can be converted to a &[u8]. This includes, but is not limited to, &str, &BStr, and of course, &[u8] itself.
Examples
Basic usage:
    use bstr::ByteVec;
    let mut s = <Vec<u8>>::from("abc");
    s.push_str(b"123");
    assert_eq!(s, "abc123".as_bytes());
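To make the distinction between pushing bytes, codepoints and strings concrete, here is a rough standard-library equivalent of the three operations. The `*_std` helpers are hypothetical names, and encoding the char as UTF-8 is this sketch's assumption about what appending a char to a byte string means.

```rust
// Hypothetical std-only counterparts; they are not bstr's implementation.

// One raw byte, appended as-is.
fn push_byte_std(v: &mut Vec<u8>, byte: u8) {
    v.push(byte);
}

// One codepoint, appended as its UTF-8 encoding (1 to 4 bytes).
fn push_char_std(v: &mut Vec<u8>, ch: char) {
    let mut buf = [0u8; 4];
    v.extend_from_slice(ch.encode_utf8(&mut buf).as_bytes());
}

// Any byte-slice-like value, appended whole.
fn push_str_std<B: AsRef<[u8]>>(v: &mut Vec<u8>, bytes: B) {
    v.extend_from_slice(bytes.as_ref());
}
```

All three paths can produce the same bytes: pushing `0xE2`, `0x98`, `0x83` one at a time, pushing the char `'☃'`, or pushing the string `"☃"`.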
fn into_string(self) -> Result<String, FromUtf8Error> where
Self: Sized,
Converts a Vec<u8> into a String if and only if this byte string is valid UTF-8.
If it is not valid UTF-8, then a FromUtf8Error is returned. (This error can be used to examine why UTF-8 validation failed, or to regain the original byte string.)
Examples
Basic usage:
    use bstr::ByteVec;
    let bytes = Vec::from("hello");
    // `unwrap` stands in for `?` so the snippet runs outside a function returning Result.
    let string = bytes.into_string().unwrap();
    assert_eq!("hello", string);
If this byte string is not valid UTF-8, then an error will be returned. That error can then be used to inspect the location at which invalid UTF-8 was found, or to regain the original byte string:
    use bstr::{B, ByteVec};
    let bytes = Vec::from_slice(b"foo\xFFbar");
    let err = bytes.into_string().unwrap_err();
    assert_eq!(err.utf8_error().valid_up_to(), 3);
    assert_eq!(err.utf8_error().error_len(), Some(1));
    // At no point in this example is an allocation performed.
    let bytes = Vec::from(err.into_vec());
    assert_eq!(bytes, B(b"foo\xFFbar"));
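The same recover-on-failure pattern exists in the standard library, which may help when comparing the two APIs. The sketch below uses `String::from_utf8` and std's own `FromUtf8Error` (whose accessor is `into_bytes`), not bstr's error type; `decode_or_recover` is a hypothetical helper name.

```rust
// Hypothetical helper: returns the decoded String, or the offset of the
// first invalid byte together with the original, unmodified buffer.
fn decode_or_recover(bytes: Vec<u8>) -> Result<String, (usize, Vec<u8>)> {
    String::from_utf8(bytes).map_err(|err| {
        let valid_up_to = err.utf8_error().valid_up_to();
        // `into_bytes` hands back the original buffer without copying it.
        (valid_up_to, err.into_bytes())
    })
}
```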
fn into_string_lossy(self) -> String where
Self: Sized,
Lossily converts a Vec<u8> into a String. If this byte string contains invalid UTF-8, then the invalid bytes are replaced with the Unicode replacement codepoint.
Examples
Basic usage:
    use bstr::ByteVec;
    let bytes = Vec::from_slice(b"foo\xFFbar");
    let string = bytes.into_string_lossy();
    assert_eq!(string, "foo\u{FFFD}bar");
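A lossy conversion of an owned buffer can be approximated with the standard library as follows. `into_string_lossy_std` is a hypothetical helper; reusing the buffer when the input is already valid is a design choice of this sketch, not a statement about how bstr implements the method.

```rust
// Hypothetical std-only sketch of a lossy, owning conversion.
fn into_string_lossy_std(bytes: Vec<u8>) -> String {
    match String::from_utf8(bytes) {
        // Valid UTF-8: the existing allocation becomes the String.
        Ok(s) => s,
        // Invalid UTF-8: build a new String with U+FFFD substituted.
        Err(err) => String::from_utf8_lossy(err.as_bytes()).into_owned(),
    }
}
```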
unsafe fn into_string_unchecked(self) -> String where
Self: Sized,
Unsafely convert this byte string into a String, without checking for valid UTF-8.
Safety
Callers must ensure that this byte string is valid UTF-8 before calling this method. Converting a byte string into a String that is not valid UTF-8 is considered undefined behavior.
This routine is useful in performance sensitive contexts where the UTF-8 validity of the byte string is already known and it is undesirable to pay the cost of an additional UTF-8 validation check that into_string performs.
Examples
Basic usage:
    use bstr::ByteVec;
    // SAFETY: This is safe because string literals are guaranteed to be
    // valid UTF-8 by the Rust compiler.
    let s = unsafe { Vec::from("☃βツ").into_string_unchecked() };
    assert_eq!("☃βツ", s);
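When adopting an unchecked conversion, one possible pattern is to keep the validation as a debug-only assertion so that test builds still catch a broken invariant. The sketch below uses std's `String::from_utf8_unchecked`; `into_string_trusted` is a hypothetical wrapper, not a bstr function.

```rust
/// Hypothetical wrapper around an unchecked conversion.
///
/// # Safety
/// `bytes` must be valid UTF-8.
unsafe fn into_string_trusted(bytes: Vec<u8>) -> String {
    // Checked in debug builds only; compiled out in release builds.
    debug_assert!(
        std::str::from_utf8(&bytes).is_ok(),
        "caller promised valid UTF-8"
    );
    // SAFETY: the caller guarantees that `bytes` is valid UTF-8.
    unsafe { String::from_utf8_unchecked(bytes) }
}
```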
fn into_os_string(self) -> Result<OsString, Vec<u8>> where
Self: Sized,
Converts this byte string into an OS string, in place.
On Unix, this always succeeds and is zero cost. On non-Unix systems, this returns the original byte string if it is not valid UTF-8.
Examples
Basic usage:
    use std::ffi::OsStr;
    use bstr::ByteVec;
    let bs = Vec::from("foo");
    let os_str = bs.into_os_string().expect("should be valid UTF-8");
    assert_eq!(os_str, OsStr::new("foo"));
fn into_os_string_lossy(self) -> OsString where
Self: Sized,
Lossily converts this byte string into an OS string, in place.
On Unix, this always succeeds and is zero cost. On non-Unix systems, this will perform a UTF-8 check and lossily convert this byte string into valid UTF-8 using the Unicode replacement codepoint.
Note that this can prevent the correct roundtripping of file paths on non-Unix systems such as Windows, where file paths are an arbitrary sequence of 16-bit integers.
Examples
Basic usage:
    use bstr::ByteVec;
    let bs = Vec::from_slice(b"foo\xFFbar");
    let os_str = bs.into_os_string_lossy();
    assert_eq!(os_str.to_string_lossy(), "foo\u{FFFD}bar");
fn into_path_buf(self) -> Result<PathBuf, Vec<u8>> where
Self: Sized,
Converts this byte string into an owned file path, in place.
On Unix, this always succeeds and is zero cost. On non-Unix systems, this returns the original byte string if it is not valid UTF-8.
Examples
Basic usage:
    use bstr::ByteVec;
    let bs = Vec::from("foo");
    let path = bs.into_path_buf().expect("should be valid UTF-8");
    assert_eq!(path.as_os_str(), "foo");
fn into_path_buf_lossy(self) -> PathBuf where
Self: Sized,
Lossily converts this byte string into an owned file path, in place.
On Unix, this always succeeds and is zero cost. On non-Unix systems, this will perform a UTF-8 check and lossily convert this byte string into valid UTF-8 using the Unicode replacement codepoint.
Note that this can prevent the correct roundtripping of file paths on non-Unix systems such as Windows, where file paths are an arbitrary sequence of 16-bit integers.
Examples
Basic usage:
    use bstr::ByteVec;
    let bs = Vec::from_slice(b"foo\xFFbar");
    let path = bs.into_path_buf_lossy();
    assert_eq!(path.to_string_lossy(), "foo\u{FFFD}bar");
fn pop_byte(&mut self) -> Option<u8>
Removes the last byte from this Vec<u8> and returns it.
If this byte string is empty, then None is returned.
If the last codepoint in this byte string is not ASCII, then removing the last byte could make this byte string contain invalid UTF-8.
Note that this is equivalent to the generic Vec::pop method. This method is provided to permit callers to explicitly differentiate between popping bytes and codepoints.
Examples
Basic usage:
    use bstr::ByteVec;
    let mut s = Vec::from("foo");
    assert_eq!(s.pop_byte(), Some(b'o'));
    assert_eq!(s.pop_byte(), Some(b'o'));
    assert_eq!(s.pop_byte(), Some(b'f'));
    assert_eq!(s.pop_byte(), None);
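The warning about non-ASCII codepoints can be checked with a small standard-library experiment. `valid_after_byte_pop` is a hypothetical helper that pops one byte with `Vec::pop` (which the text describes as equivalent) and reports whether the remainder is still valid UTF-8.

```rust
// Hypothetical helper: pop a single byte, then re-validate what is left.
fn valid_after_byte_pop(text: &str) -> bool {
    let mut bytes = text.as_bytes().to_vec();
    bytes.pop();
    std::str::from_utf8(&bytes).is_ok()
}
```

For ASCII-terminated input such as `"foo"` the remainder stays valid. For `"foo☃"`, whose last codepoint occupies three bytes, the remainder ends in a truncated sequence and validation fails.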
fn pop_char(&mut self) -> Option<char>
Removes the last codepoint from this Vec<u8> and returns it.
If this byte string is empty, then None is returned. If the last bytes of this byte string do not correspond to a valid UTF-8 code unit sequence, then the Unicode replacement codepoint is yielded instead in accordance with the replacement codepoint substitution policy.
Examples
Basic usage:
    use bstr::ByteVec;
    let mut s = Vec::from("foo");
    assert_eq!(s.pop_char(), Some('o'));
    assert_eq!(s.pop_char(), Some('o'));
    assert_eq!(s.pop_char(), Some('f'));
    assert_eq!(s.pop_char(), None);
This shows the replacement codepoint substitution policy. Note that the first pop yields a replacement codepoint but actually removes two bytes. This is in contrast with subsequent pops when encountering \xFF since \xFF is never a valid prefix for any valid UTF-8 code unit sequence.
    use bstr::ByteVec;
    let mut s = Vec::from_slice(b"f\xFF\xFF\xFFoo\xE2\x98");
    assert_eq!(s.pop_char(), Some('\u{FFFD}'));
    assert_eq!(s.pop_char(), Some('o'));
    assert_eq!(s.pop_char(), Some('o'));
    assert_eq!(s.pop_char(), Some('\u{FFFD}'));
    assert_eq!(s.pop_char(), Some('\u{FFFD}'));
    assert_eq!(s.pop_char(), Some('\u{FFFD}'));
    assert_eq!(s.pop_char(), Some('f'));
    assert_eq!(s.pop_char(), None);
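The substitution behaviour shown above can be imitated with the standard library, which may help in reasoning about how many bytes each pop removes. `pop_char_lossy` is a hypothetical helper that scans forward using `std::str::from_utf8` and its error offsets. This is an assumption-laden sketch: it costs O(n) per pop and is not bstr's algorithm, although it reproduces the sequence in the example.

```rust
// Hypothetical std-only sketch: pop the last codepoint, substituting
// U+FFFD for a trailing invalid or incomplete sequence.
fn pop_char_lossy(v: &mut Vec<u8>) -> Option<char> {
    if v.is_empty() {
        return None;
    }
    let mut start = 0;
    loop {
        // Summarize the decode result so the borrow of `v` ends before mutation.
        let step = match std::str::from_utf8(&v[start..]) {
            Ok(tail) => Ok(tail.chars().next_back()),
            Err(err) => Err((start + err.valid_up_to(), err.error_len())),
        };
        match step {
            // The rest of the buffer is valid: remove its last char.
            Ok(last) => {
                let ch = last?;
                let new_len = v.len() - ch.len_utf8();
                v.truncate(new_len);
                return Some(ch);
            }
            // Incomplete sequence at the very end (for example \xE2\x98).
            Err((bad_start, None)) => {
                v.truncate(bad_start);
                return Some('\u{FFFD}');
            }
            // Invalid sequence that ends exactly at the end of the buffer.
            Err((bad_start, Some(n))) if bad_start + n == v.len() => {
                v.truncate(bad_start);
                return Some('\u{FFFD}');
            }
            // Invalid sequence in the middle: skip past it and keep scanning.
            Err((bad_start, Some(n))) => start = bad_start + n,
        }
    }
}
```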
fn remove_char(&mut self, at: usize) -> char
Removes a char from this Vec<u8> at the given byte position and returns it.
If the bytes at the given position do not lead to a valid UTF-8 code unit sequence, then a replacement codepoint is returned instead.
Panics
Panics if at is larger than or equal to this byte string's length.
Examples
Basic usage:
    use bstr::ByteVec;
    let mut s = Vec::from("foo☃bar");
    assert_eq!(s.remove_char(3), '☃');
    assert_eq!(s, b"foobar");
This example shows how the Unicode replacement codepoint policy is used:
    use bstr::ByteVec;
    let mut s = Vec::from_slice(b"foo\xFFbar");
    assert_eq!(s.remove_char(3), '\u{FFFD}');
    assert_eq!(s, b"foobar");
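Removal at a byte position follows the same idea: decode one codepoint starting at the position, then remove exactly the bytes it covered. The sketch below is a hypothetical std-only approximation (`remove_char_lossy` is not a bstr function). It inspects at most four bytes, the longest possible UTF-8 sequence.

```rust
// Hypothetical std-only sketch of removing one codepoint at byte offset `at`.
fn remove_char_lossy(v: &mut Vec<u8>, at: usize) -> char {
    assert!(at < v.len(), "byte position out of bounds");
    // A UTF-8 sequence is at most 4 bytes long, so a 4-byte window suffices.
    let end = (at + 4).min(v.len());
    let (ch, size) = match std::str::from_utf8(&v[at..end]) {
        // The whole window is valid: take its first char.
        Ok(s) => {
            let c = s.chars().next().expect("window is non-empty");
            (c, c.len_utf8())
        }
        // Only a prefix is valid: the first char still decodes cleanly.
        Err(err) if err.valid_up_to() > 0 => {
            let c = std::str::from_utf8(&v[at..at + err.valid_up_to()])
                .expect("prefix was reported valid")
                .chars()
                .next()
                .expect("prefix is non-empty");
            (c, c.len_utf8())
        }
        // Invalid right at `at`: substitute U+FFFD for the bad bytes.
        Err(err) => ('\u{FFFD}', err.error_len().unwrap_or(end - at)),
    };
    v.drain(at..at + size);
    ch
}
```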
fn insert_char(&mut self, at: usize, ch: char)
Inserts the given codepoint into this Vec<u8> at a particular byte position.
This is an O(n) operation as it may copy a number of elements in this byte string proportional to its length.
Panics
Panics if at is larger than the byte string's length.
Examples
Basic usage:
    use bstr::ByteVec;
    let mut s = Vec::from("foobar");
    s.insert_char(3, '☃');
    assert_eq!(s, "foo☃bar".as_bytes());
fn insert_str<B: AsRef<[u8]>>(&mut self, at: usize, bytes: B)
Inserts the given byte string into this byte string at a particular byte position.
This is an O(n) operation as it may copy a number of elements in this byte string proportional to its length.
The given byte string may be any type that can be cheaply converted into a &[u8]. This includes, but is not limited to, &str and &[u8].
Panics
Panics if at is larger than the byte string's length.
Examples
Basic usage:
    use bstr::ByteVec;
    let mut s = Vec::from("foobar");
    s.insert_str(3, "☃☃☃");
    assert_eq!(s, "foo☃☃☃bar".as_bytes());
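The O(n) cost comes from shifting every byte after the insertion point. A hand-written standard-library version makes that shift explicit. `insert_bytes` is a hypothetical helper written for illustration, not bstr's implementation.

```rust
// Hypothetical std-only sketch that spells out the tail shift.
fn insert_bytes(v: &mut Vec<u8>, at: usize, bytes: &[u8]) {
    assert!(at <= v.len(), "insertion index out of bounds");
    let old_len = v.len();
    // Grow the buffer to make room for the new bytes.
    v.resize(old_len + bytes.len(), 0);
    // Shift the tail right: this copy is proportional to `old_len - at`.
    v.copy_within(at..old_len, at + bytes.len());
    // Fill the gap with the inserted bytes.
    v[at..at + bytes.len()].copy_from_slice(bytes);
}
```

Inserting at the end shifts nothing, while inserting at position 0 moves the whole existing buffer.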
fn replace_range<R, B>(&mut self, range: R, replace_with: B) where
R: RangeBounds<usize>,
B: AsRef<[u8]>,
Removes the specified range in this byte string and replaces it with the given bytes. The given bytes do not need to have the same length as the range provided.
Panics
Panics if the given range is invalid.
Examples
Basic usage:
    use bstr::ByteVec;
    let mut s = Vec::from("foobar");
    s.replace_range(2..4, "xxxxx");
    assert_eq!(s, "foxxxxxar".as_bytes());
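For comparison, the standard library's `Vec::splice` offers the same replace-a-range-with-different-length behaviour. The wrapper below, `replace_range_std`, is a hypothetical name; the sketch does not claim that bstr is implemented this way.

```rust
use std::ops::RangeBounds;

// Hypothetical std-only counterpart built on `Vec::splice`.
fn replace_range_std<R, B>(v: &mut Vec<u8>, range: R, replace_with: B)
where
    R: RangeBounds<usize>,
    B: AsRef<[u8]>,
{
    // `splice` returns an iterator over the removed bytes; dropping it
    // completes the replacement.
    drop(v.splice(range, replace_with.as_ref().iter().copied()));
}
```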
fn drain_bytes<R>(&mut self, range: R) -> DrainBytes<'_> where
R: RangeBounds<usize>,
Creates a draining iterator that removes the specified range in this Vec<u8> and yields each of the removed bytes.
Note that the elements specified by the given range are removed regardless of whether the returned iterator is fully exhausted.
Also note that it is unspecified how many bytes are removed from the Vec<u8> if the DrainBytes iterator is leaked.
Panics
Panics if the given range is not valid.
Examples
Basic usage:
    use bstr::ByteVec;
    let mut s = Vec::from("foobar");
    {
        let mut drainer = s.drain_bytes(2..4);
        assert_eq!(drainer.next(), Some(b'o'));
        assert_eq!(drainer.next(), Some(b'b'));
        assert_eq!(drainer.next(), None);
    }
    assert_eq!(s, "foar".as_bytes());
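The point about removal happening even when the iterator is not exhausted can be observed with the standard library's `Vec::drain`, which documents the same behaviour. `drain_first_only` is a hypothetical helper used only for this illustration.

```rust
use std::ops::Range;

// Hypothetical helper: read just the first drained byte, then drop the
// iterator early. The whole range is still removed from `v`.
fn drain_first_only(v: &mut Vec<u8>, range: Range<usize>) -> Option<u8> {
    let mut drainer = v.drain(range);
    let first = drainer.next();
    // Dropping the iterator removes the remaining bytes of the range.
    drop(drainer);
    first
}
```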