\documentclass[12pt]{article}
\textwidth= 6.25in \textheight= 9.0in \topmargin = -10pt
\evensidemargin=10pt \oddsidemargin=10pt \headsep=25pt
\parskip=10pt
\font\smallit=cmti10 \font\smalltt=cmtt10 \font\smallrm=cmr9


         % Width of text line.
%\pagestyle{empty}
\newtheorem{theo}{Theorem}[section]
\newtheorem{lemma}[theo]{Lemma}
\newtheorem{coro}[theo]{Corollary}
\newtheorem{cor}[theo]{Corollary}      %%%JHK%%%%
\newtheorem{prop}[theo]{Proposition}
\newtheorem{claim}[theo]{Claim}
%\newcommand{\leqsim}{<<}
\newcommand{\HH}{{\cal H}}
\newcommand{\TT}{{\cal T}}
\newcommand{\old}[1]{}
\newcommand\raf[1]{(\ref{#1})}
\def\N {{\bf {N}}}
\def\Z{\mathbb{Z}}
%\def\N{\bf N}
%\def\Z{\bf Z}
\def\visit{\hbox{\rm visit}}
\def\E{{\sf E}}

\def\proofend{\hfill$\square$\medskip}
\def\beq{\begin{equation}}
\def\eqe{\end{equation}}
\def\begarray{\begin{eqnarray*}}
\def\endarray{\end{eqnarray*}}
\def\Proof{\medskip\noindent{\bf Proof. }}
\renewcommand{\Pr}{{\sf P}}
\def\bn{\bf \noindent}
\def\ep{\varepsilon}
\def\th{\theta}
\def\BM{{\mathbf M}}
\def\BB{{\mathbb B}}
\def\bC{{\mathbf C}}
\def\bV{{\mathbf V}}
\def\baY{\overline Y}
\def\OM{{\mathbf \Omega}}
\def\Bm{\mathbf m}
\def\tk{\tilde k}
\def\CM{{\mathcal M}}
\def\CN{{\mathcal N}}
\def\CI{{\mathcal I}}
\def\CF{{\mathcal F}}
\def\CE{{\mathcal E}}
\def\CU{{\mathcal U}}
\def\CP{{\mathcal P}}
\def\bx{{\mathbf x}}
\def\Bc{{\mathbf c}}
\def\da{d\alpha}
\def\dg{d\gamma}
%\def{\bf E}{ {\bf E}}
\def\iI{\int_I}
\def\OO {O^{\ast}}
\def\hs{\hfill $\square$}
\def\dd{\partial}
\def\PE{ P. Erd\H os}
\def\bO{{\mathbf \Omega}}

%\newcommand{\corr}[1]{{\bf #1}}
\newcommand{\corr}[1]{}


\begin{document}

\begin{center}


{\bf HIGH ORDER COMPLEMENTARY BASES OF PRIMES}

{\bf V. H. Vu \footnote{Research supported in part by grant
RB091G-VU from UCSD, by  NSF grant DMS-0200357, and by a Sloan
fellowship. 
\vskip 5pt
\noindent AMS Subject Classificiation Numbers: 05, 11.

\noindent  Key words: Complementary bases, the probabilistic method,
primes.  }}\\
{\smallit Department of Mathematics, University of California at San Diego \\ La Jolla, CA 92093, USA}\\
{\tt vanvu@ucsd.edu}\\  \vskip 10pt

\end{center}


\vskip 30pt \centerline{\smallit Received:  3/24/02, Accepted: 10/22/02,
Published:  10/23/02
}
\vskip 30pt

\centerline{\bf Abstract}


%\begin{abstract}
\noindent
We show that there is a set $X \subset {\bf N} $ with density
$O(\log n)$ such that every sufficiently large natural number can
be represented as sum of two elements from $X$ and a prime. The
density is a $\log ^{1/2} n$ factor off being best possible.
%\end{abstract}


\pagestyle{myheadings} \markright{\smalltt INTEGERS: \smallrm
ELECTRONIC JOURNAL OF COMBINATORIAL NUMBER THEORY
\smalltt 2 (2002),
\#A12\hfill }

\thispagestyle{empty} \baselineskip=15pt \vskip 30pt


% DIMENSION OF TEXT:

%\renewcommand{\baselinestretch}{1.2}
%\topmargin 0pt \headsep 0pt


\section*{\normalsize 1. Introduction}
\addtocounter{section}{+1}

A central notion in additive number theory is that of bases. A
subset $X$ of ${\bf N} $ is a basis of order $k$ if every
sufficiently large number $n \in {\bf N} $ can be represented as a
sum of $k$ elements of $X$. Here and later ${\bf N} $ denotes the
set of natural numbers.

In this note, we are working  with a related notion of
complementary bases. Given a set $A \subset {\bf N} $, a set $X
\subset {\bf N} $ is a complementary basis of order $k$ of $A$ if
every sufficiently large natural number can be written as a sum of
an element in $A$ and $k$ elements in $X$.  All asymptotic
notations are used under the assumption that $n \rightarrow
\infty$. The logarithms have natural base.


Consider the set $P$ of primes. Since $P$ has density $n/ \log n$,
it is clear by the pigeon hole principle that a complementary
basis of order $k$ of $P$ should have density $\Omega (\log^{1/k}
n)$. As far as we know, it is still an open question to determine,
even for the case $k=1$, that whether there is a complementary
basis of $P$ with density $O(\log^{1/k} n)$. In \cite{Erd}, Erd\H
os shown that for $k=1$, there is a complementary base of density
$O(\log^2 n)$. In a recent paper,  Ruzsa \cite{Ruz}, improving a
result of  Kolountzakis \cite{Kol}, showed that  there exists a
set $X$ of density  $\omega(n) \log n$,  where $\omega (n)$ is a
function tending to infinity arbitrarily slow in $n$, such that
the set $X+P$ has density one (i.e., almost all natural numbers
can be represented as a sum of an element from $X$ and a prime).

In this note, we focus on the case $k = 2$. Our main theorem is


\begin{theo}\label{vu} $P$ has a complementary basis of order 2
and density $O(\log  n)$.

\end{theo}

\begin{cor}\label{vu1} For all $k \ge 2$,
$P$ has a complementary basis of order $k$
and density $O(\log  n)$.

\end{cor}

\noindent We conjecture that


\vskip 1mm  \noindent {\bf Conjecture.} {\it  For any fixed $k$,
$P$ has a complementary basis of order k and density
$O(\log^{1+o(1))/k} n)$. }

\vskip 1mm

The probabilistic method is the  only  effective approach so far
to this type of problems. It seems that to prove even the density
$O(\log^{2/k} n)$ for $k \ge 3$ requires a new tool from
probability theory. Such a tool should surely be of independent
interest. The first step towards the conjecture might be to prove
a bound with the exponent decreasing in $k$.

\section*{\normalsize 2. Proof  of Theorem 1.1}
\addtocounter{section}{+1}

We construct the claimed  complementary basis $X$ by the  random
method. For each $x \in {\bf N} $, choose $x$ to be in $X$ with
probability $p_x = \Bc/ x$, where $\Bc$ is a positive constant to
be determined. Let $t_x$ be the binary random variable
representing the choice of $x$ (thus $t_x=1$ with probability
$p_x$ and $0$ with probability $1-p_x$). We skip the fairly easy
proof of the fact that almost surely, $X(m)= O(\log m)$ for every
$m$, i.e, $X$ has the right density; the interested readers might
want to consider this as a warm-up exercise. Now let us consider
the number representations of $n$ as sum of a prime and two
elements from $X$. This number is a random variable depending on
the $t_j$'s, $j < n$  and  can be expressed as follows

$$Y_n = \sum_{p < n} \,\,\, \sum_{i+ j= n-p} t_i t_j , $$

\noindent where in the  second sum we do not count permutations.
Here and later in a sum over $p$ we understand that $p$ is a
prime.  We next show that there is a constant $n_0$ such that with
probability at least $1/2$, $Y _n \ge 1$ for all $n \ge n_0$. To
achieve this goal, it suffices to prove that for all sufficiently
large $n$, $Pr (Y_n =0) \le n^{-2}$ (notice that the sum $\sum_{n=
n_0}^{\infty} n^{-2}$ goes to 0 as $n_0$ tends to infinity). It
has turned out that it is much more convenient to work with the
following truncation of $Y_n$

$$Y'_n =  \sum_{p \le n - 2n^{2/3} }\,\,  \sum_{  i+ j =n-p; \,\,\, i,j
\ge n^{2/3 } } t_i t_j.  $$

\noindent In the following, we shall prove


 \begin{equation} \label {Pro1} Pr (Y'=0) \le n^{-2}. \end{equation}

\noindent  Central to the proof of (\ref{Pro1}) is the following

 \begin{lemma} \label{sum} For all sufficient large $n$
 $$\sum_{ p \le  n- n^{2/3} } \frac{1}{
 n-p} =  \Theta(1), $$

 \noindent and

  $$\sum_{ p \le  n- 2n^{2/3}} \frac{1}{
 n-p} =  \Theta(1). $$
 \end{lemma}

 {\bf \noindent  Proof of Lemma 2.3.}
To verify the first equality, let us set  $n_1 = n -  n^{2/3}$ and
$n_l= n_{l-1} - n_{l-1}^{2/3}$ for all $l=2,3,4, \dots, s$ where
$s$ is the first place where $n_s \le n/2$. It is a routine to
show by induction that

\begin{equation} \label{NI}
n - \frac{l n^{2/3}}{2 }\ge n_l \ge n - ln^{2/3}.
\end{equation}

Let $P_l$  denote the set of primes in the interval $[n_l,
n_{l-1})$. It is clear, by (\ref{NI}), that for all $p\in P_l$,

\begin{equation} \label{bound}
\frac{2}{l n^{2/3} } \ge \frac{1}{n-p} \ge \frac{1}{ l n^{2/3}}.
\end{equation}

On the other hand, it is a well-known fact in number theory that
the number of primes in the interval $[m- m^{2/3}, m)$ is $\Theta
(m^{2/3}/ \log m)$ for every sufficiently large $m$ (see, for
instance, \cite{HR}). Thus

$$|P_l| = \Theta (n_{l-1}^{2/3} / \log n_{l-1}) =  \Theta
(n^{2/3} / \log n) $$

\noindent  for all $l$. This and (\ref{bound}) yield

\begin{equation}
\sum_{l=2}^s \,\, \sum_{p \in P_l} \frac{1}{n-p} = \Theta
(\log^{-1}n \sum_{l=2}^s 1/l ) = \Theta (\log^{-1}n \times \log s)
=\Theta (1).
\end{equation}

 To complete the
proof of the first equality,  notice that $\sum_{p \ge n/2}
1/(n-p) \le \sum_{j=n/2}^n 1/j = O(\frac{n}{\log n}
\frac{1}{n})=o(1)$.  The second equality follows easily. \hfill
$\mathcal{Q.E.D}$

\vskip 2mm

The second main ingredient of our proof is a large deviation
bound, due to Janson \cite{Jan}. Let $z_1, \dots, z_m$ be
independent indicator random variables and consider a random
variable $Y= \sum_{\alpha} I_{\alpha}$ where each $I_{\alpha}$ is
the product of few $z_j$'s. We write $\alpha \sim \beta$ if there
is some $z_j$ which occurs in both $I_{\alpha}$ and $I_{\beta}$.
Furthermore, set $\Delta =  \sum_{\alpha \sim \beta} {\bf E}
(I_{\alpha} I_{\beta}) $. To this end, $ {\bf E} (A)$ denotes the
expectation of the random variable $A$.

\begin{theo}\label{Janson} (Janson) For any $Y$ as above and any positive
number $\ep $

$$Pr (Y \le (1-\ep) {\bf E} (Y)) \le e^{-\frac{ (\ep {\bf E}(Y))^2}{2({\bf E}
(Y) + \Delta)} }. $$

\end{theo}

From Theorem \ref{Janson}, one can routinely derive the following
lemma. For a pair $1 \le i,j \le m$, we  write $i \sim j$ if
there is some $\alpha$ such that $I_{\alpha}$ contains both $z_i$
and $z_j$.

\begin{lemma}  \label{janson}

 There is a positive constant  $r$ such that
 the following holds. If each term  $I_{\alpha}$ in $Y$ is the product of
exactly 2 random variables and for all $ m \ge i \ge 1$, $ {\bf E}
(Y) \ge r \log n \Big( {\bf E} (\sum_{j \sim i} t_j ) + 1 \Big)$,
then

$$Pr (Y=0) \le n^{-2}. $$

\end{lemma}

We now apply Lemma \ref{janson} to $Y'_n$. Let us notice that in
our setting, $i \sim j$ if and only if there is a prime number $p$
such that $i+j +p=n$. Therefore,

 $$ {\bf E} (\sum_{j \sim i} t_j ) = \sum_{p \le n -i -n^{2/3}}
 \frac{\Bc}{n-i-p} \le \sum_{ p \le m - m^{2/3} } \frac{\Bc}{
 m-p}, $$

\noindent where $m=n-i$. By Lemma \ref{sum}, there is a constant
$a$ such that ${\bf E} (\sum_{j \sim i} t_j ) \le a \Bc$.
Moreover, a simple calculation yields that

 $${\bf E} (\sum_{i+ j= n-p, i,j \ge
n^{2/3 }
 } t_i t_j) = \Omega ( \Bc ^2 \frac{\log (n-p)}{n-p}). $$

\noindent This and  Lemma \ref{sum} together imply

 \begin{equation} \label{expprime} {\bf E} (Y') =
 \Omega (\sum_{ p \le n- 2n^{2/3} } \frac{\log
 (n-p)}{n-p}) = \Omega (\Bc^2 \log n \sum _{ p \le n- 2n^{2/3} }
 \frac{1}{n-p}) = \Omega (\Bc^2 \log n). \end{equation}

(\ref{expprime}) guarantees that there is a positive constant $b$
such that ${\bf E}(Y') \ge b\Bc^2 \log n $. Thus, by increasing
$\Bc$, we can guarantee that the condition of Lemma \ref{janson}
is met and this completes the proof.  \hfill $\mathcal{Q.E.D}$

\noindent
{\it Acknowledgement.} The author would like to thank Andrew Granville for
a correction concerning the result of Ruzsa  quoted in the introduction.

\begin{thebibliography}{99}


\bibitem{Erd} \PE, { Problems and results in additive number
theory,} {\it Colloque sur la Th\'eory de Nombre (CBRM)}
(Bruxelles, 1956), 127-137

\bibitem{HR} H. Halberstam and K. F. Roth, Sequences, Springer-Verlag,
 New York, 1983.

\bibitem {Jan}  S. Janson,  { Poisson approximation for large deviations},
{\it Random Structures and Algorithms,}  1 (1990), 221-230.

\bibitem {Kol}  M. Kolountzakis,  On the additive complements
 of the primes and sets of similar growth, {\it Acta Arith.} 77 (1996), no. 1, 1--8.

\bibitem {Ruz} I. Ruzsa,  On the additive completion of primes.
{\it Acta Arith. 86} (1998), no. 3, 269--275.

\end{thebibliography}


\end{document}