C++03 5.1 Primary expressions
§2:
A literal is a primary expression. Its type depends on its form (2.13). A string literal is an lvalue; all other literals are rvalues.
What is the rationale behind this?
As I understand it, string literals are objects, while all other literals are not. And an lvalue always refers to an object.
But the question then is why are string literals objects while all other literals are not?
This rationale seems to me more like a chicken-and-egg problem.
I understand the answer to this may be related to hardware architecture rather than to C/C++ as programming languages; nevertheless, I would like to hear it.
Note: I am tagging this question as both c and c++ because the C99 standard has a similar statement, specifically §6.5.1, paragraph 4.
alias<T[N]> {} is possible now. U {}.arr is also an rvalue of array type if arr is declared as such in the class definition for U - Luc Danton 2012-04-04 03:50
& operator". I suspect that definition is actually equivalent to the standard's definition, unless I'm missing something. - R.. 2012-04-04 04:10
&, but are lvalues. Also, I'm rather unclear on why it's (presumably) invalid to apply & to the return value of a function, which is specified to have object type. - R.. 2012-04-04 04:56
A string literal is a literal with array type, and in C there is no way for an array type to exist in an expression except as an lvalue. String literals could have been specified to have pointer type (rather than array type that usually decays to a pointer) pointing to the string "contents", but this would make them rather less useful; in particular, applying the sizeof operator to them would no longer yield the size of the string.
Note that C99 introduced compound literals, which are also lvalues, so having a literal be an lvalue is no longer a special exception; it's closer to being the norm.
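The sizeof point can be checked directly. The following is a minimal sketch in C++, not part of the original answer; the function names size_as_array and size_as_pointer are hypothetical and chosen only for illustration.

```cpp
#include <cstddef>

// sizeof applied to the literal itself sees the array type const char[6]:
// five characters plus the terminating '\0'.
std::size_t size_as_array() {
    return sizeof("hello");
}

// Once the literal has decayed to a pointer, the array length is gone and
// sizeof reports only the size of the pointer.
std::size_t size_as_pointer() {
    const char* p = "hello";
    return sizeof(p);
}
```

If string literals had pointer type from the start, only the second result would be available.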
puts("hello") an example of an expression with an array type that could be an rvalue - Pubby 2012-04-04 03:38
"hello" is not an rvalue. It's an lvalue array which decays to an expression of type pointer-to-char - R.. 2012-04-04 03:43
struct x { int a[2]; }; struct x foo(void); then foo().a is an rvalue array. Also, given struct x bar, quux; then (1 ? bar : quux).a is an rvalue array - caf 2012-04-04 04:21
+1 has object type (int) but is not ordinarily considered an lvalue. Note that Example 1 in C99 §6.5.2.3 specifically calls out f().x as being "a valid postfix expression but is not an lvalue" - caf 2012-04-04 06:24
const type, but there is an implicit conversion which will remove the const. In both cases, these special rules only apply to string literals. - James Kanze 2012-04-04 07:42
struct return values. The standard is pretty weak in terms of describing what one can do with them, though. The big issue in implementations is that they may (or may not) be stored in registers (for sufficiently small structures) or similar "ephemeral" storage, and array manipulation, even something as simple as subscripting to extract one element, can overwrite this storage; but "normal" array access requires a fairly durable pointer to the base of the array. How long is that pointer valid? Who knows - torek 2012-04-04 07:58
struct-valued function was either: struct_instance = f(args); or (void) f(args);. C99 tries to make it clear that you can also select a struct element and (subsequently) an array element, but not grab hold of a pointer to the entire array. This works right in gcc, but it's probably a good test for other compilers. (I'd guess the Plum-Hall test suite has a test like this by now.) - torek 2012-04-04 18:04
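The struct-return case discussed in the comments can be sketched as follows. This is a C++11 rendering, offered as an illustration under assumptions rather than as the commenters' own code: the body of foo and the helper second_element are hypothetical, and the C rules being discussed differ in detail from the C++ ones.

```cpp
#include <type_traits>

struct x { int a[2]; };

// Hypothetical definition for the function that the comment only declares.
x foo() {
    x result = {{1, 2}};
    return result;
}

// foo() is an rvalue, so foo().a is an array-typed expression that is
// not an lvalue.
static_assert(!std::is_lvalue_reference<decltype((foo().a))>::value,
              "foo().a is not an lvalue");

// Selecting a single element of the temporary array is fine; the temporary
// lives until the end of the full expression.
int second_element() {
    return foo().a[1];
}
```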
String literals are arrays - objects of inherently unpredictable size (i.e. of user-defined and possibly large size). In the general case, there's simply no other way to represent such literals except as objects in memory, i.e. as lvalues. In C99 this also applies to compound literals, which are also lvalues.
Any attempt to artificially hide the fact that string literals are lvalues at the language level would produce a considerable number of completely unnecessary difficulties, since the ability to point to a string literal with a pointer, as well as the ability to access it as an array, relies critically on its lvalue-ness being visible at the language level.
Meanwhile, literals of scalar types have a fixed compile-time size. At the same time, such literals are very likely to be embedded directly into the machine instructions on the given hardware architecture. For example, when you write something like i = i * 5 + 2, the literal values 5 and 2 become explicit (or even implicit) parts of the generated machine code. They don't exist and don't need to exist as standalone locations in data storage. There's simply no point in storing the values 5 and 2 in data memory.
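The difference in value category between the two kinds of literal is observable from within the language. The sketch below assumes C++11 or later (the question quotes C++03, which has no decltype or rvalue references), and is_lvalue is a hypothetical helper, not a standard function.

```cpp
#include <type_traits>

// Hypothetical helper: with a forwarding reference, T is deduced as an
// lvalue reference type exactly when the argument is an lvalue.
template <class T>
constexpr bool is_lvalue(T&&) {
    return std::is_lvalue_reference<T>::value;
}

static_assert(is_lvalue("hello"), "a string literal is an lvalue");
static_assert(!is_lvalue(5), "an integer literal is an rvalue");
static_assert(!is_lvalue(5.5), "a floating-point literal is an rvalue");
static_assert(!is_lvalue('x'), "a character literal is an rvalue");
```

A named variable passed to is_lvalue reports true, in the same way as the string literal.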
It is also worth noting that on many (if not most, or all) hardware architectures floating-point literals are actually implemented as "hidden" lvalues (even though the language does not expose them as such). On platforms like x86, floating-point instructions do not support embedded immediate operands. This means that virtually every floating-point literal has to be stored in (and read from) data memory by the compiler. E.g. when you write something like i = i * 5.5 + 2.1 it is translated into something like
const double unnamed_double_5_5 = 5.5;
const double unnamed_double_2_1 = 2.1;
i = i * unnamed_double_5_5 + unnamed_double_2_1;
In other words, floating-point literals often end up becoming "unofficial" lvalues internally. However, it makes perfect sense that the language specification did not make any attempt to expose this implementation detail. At the language level, arithmetic literals make more sense as rvalues.
'x' or 5 in the source code are "swallowed" in the executable during the compilation and "become part of it", whereas memory is reserved for "x" and 5.5 at runtime, so that they are created by the executable, stored in memory, but are not part of the executable file itself. Have I completely missed the point? - Enrico Maria De Angelis 2018-11-27 17:05
x * 2.0 will usually compile as x+x. That really emphasizes that the "hidden lvalue" thing is truly just an asm implementation detail, and not fundamental or even related to language rules. More of a fun fact, but yeah interesting to point out. (Although the as-if rule does even allow the compiler to modify string literals, e.g. turn printf("hello\n") into puts("hello").) - Peter Cordes 2019-02-13 14:13
An lvalue in C++ does not always refer to an object. It can refer to a function too. Moreover, objects do not have to be referred to by lvalues. They may be referred to by rvalues, and this includes arrays (in C++ and C). However, in old C89, the array-to-pointer conversion did not apply to rvalue arrays.
Now, an rvalue denotes something with no lifetime, a limited lifetime, or a lifetime that is about to expire. A string literal, however, lives for the entire program.
So string literals being lvalues is exactly right.
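Both claims can be illustrated with a short sketch, assuming C++11 or later; greeting is a hypothetical function introduced only for this example.

```cpp
#include <type_traits>

// Returning a pointer to a string literal is safe: the array it designates
// has static storage duration, so it outlives the call.
const char* greeting() {
    return "hello";
}

// The name of a function is an lvalue even though a function is not an object.
static_assert(std::is_lvalue_reference<decltype((greeting))>::value,
              "a function name is an lvalue");
```

Returning the address of a local array from greeting would, by contrast, leave the caller with a dangling pointer.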
I'd guess that the original motive was mainly a pragmatic one: a string
literal must reside in memory and have an address. The type of a string
literal is an array type (char[] in C, char const[] in C++), and
array types convert to pointers in most contexts. The language could
have found other ways to define this (e.g. a string literal could have
pointer type to begin with, with special rules concerning what it
pointed to), but just making the literal an lvalue is probably the
easiest way of defining what is concretely needed.
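What "have an address" and "convert to pointers" mean in practice can be sketched as below, assuming C++11 or later. The names HiArray, whole_literal and first_char are hypothetical and used only for illustration.

```cpp
#include <type_traits>

// decltype of an lvalue expression yields a reference type, so this confirms
// both the array type and the lvalue-ness of the literal.
static_assert(std::is_same<decltype("hi"), const char (&)[3]>::value,
              "the literal is an lvalue of type const char[3]");

// std::decay models the array-to-pointer conversion applied in most contexts.
static_assert(std::is_same<std::decay<decltype("hi")>::type, const char*>::value,
              "the literal decays to const char*");

typedef const char HiArray[3];

// Because the literal is an lvalue, unary & applies to it and yields a
// pointer to the whole array.
HiArray* whole_literal() {
    return &"hi";
}

// In an ordinary initialization the array decays to a pointer to its
// first element.
char first_char() {
    const char* p = "hi";
    return *p;
}
```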
const - torek 2012-04-04 08:30
noalias; Ritchie's "noalias must go" response was grounded in both pragmatics and theory (he demonstrated that "noalias" was self-inconsistent) - torek 2012-04-04 09:04